1 Introduction

The complex of free factors of a free group $\mathbb{F}$ of rank $n$ is the
simplicial complex $\mathcal{F}$ whose vertices are conjugacy classes of proper
free factors $A$ of $\mathbb{F}$, and whose simplices are determined by chains
$A_1 < A_2 < \cdots < A_k$. The outer automorphism group $Out(\mathbb{F})$ acts
naturally on $\mathcal{F}$, which can be thought of as an analog of the
Bruhat-Tits building associated with $GL_n(\mathbb{Z})$. This complex was
introduced by Hatcher and Vogtmann in [18], where it is shown that it has the
homotopy type of a wedge of spheres of dimension $n-2$. They defined this
complex in terms of sphere systems in $\#_1^n S^1 \times S^2$ and used variants
in their work on homological stability [17], [19], [16].

There is a very useful analogy between $\mathcal{F}$ and the curve complex
$\mathcal{C}$ associated with a compact surface (with punctures) $S$. The
vertices of $\mathcal{C}$ are isotopy classes of essential simple closed curves
in $S$, and simplices are
determined by pairwise disjoint curves. The curve complex was introduced 
by Harvey [H] and was classically used by Harer in his work on duality and 
homological stability of mapping class groups [13^ ^^ . The key result here 
is that the curve complex is homotopy equivalent to the wedge of spheres. 

More recently, the curve complex has been used in the study of the
geometry of mapping class groups and of ends of hyperbolic 3-manifolds.
The fundamental result on which this work is based is the theorem of Masur
and Minsky [22] that the curve complex is hyperbolic. In the low complexity
cases when $\mathcal{C}$ is a discrete set one modifies the definition of
$\mathcal{C}$ by adding an edge when the two curves intersect minimally. In the
same way, we modify the definition of $\mathcal{F}$ when the rank is $n = 2$ by
adding an edge when the two free factors (necessarily of rank 1) are determined
by a basis of $\mathbb{F}$, i.e. whenever $\mathbb{F} = \langle a, b\rangle$,
then $\langle a\rangle$ and $\langle b\rangle$ span an edge. In this way
$\mathcal{F}$ becomes the standard Farey graph. The main result in this paper is:

Main Theorem. The complex $\mathcal{F}$ of free factors is hyperbolic.

The statement simply means that when the 1-skeleton of $\mathcal{F}$ is
equipped with the path metric in which every edge has length 1, the resulting
graph is hyperbolic.
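The rank 2 case can be made concrete. The sketch below assumes the standard identification, which is not spelled out in the text, of a rank 1 free factor $\langle w\rangle$ of $\mathbb{F} = \langle a, b\rangle$ with the slope $p/q$ given by the exponent sums of $w$ in $a$ and $b$. Under that assumption two slopes $p/q$ and $r/s$ span an edge of the Farey graph exactly when $|ps - qr| = 1$. The function names and the truncation parameter `bound` are illustrative only.

```python
from collections import deque
from math import gcd

def farey_adjacent(u, v):
    """Slopes u = (p, q) and v = (r, s) span an edge iff |ps - qr| = 1."""
    (p, q), (r, s) = u, v
    return abs(p * s - q * r) == 1

def slopes(bound):
    """One primitive pair (p, q) per slope p/q with |p|, q <= bound; (1, 0) is 1/0."""
    out = [(1, 0)]
    for q in range(1, bound + 1):
        for p in range(-bound, bound + 1):
            if gcd(p, q) == 1:
                out.append((p, q))
    return out

def farey_distance(u, v, bound):
    """Breadth-first edge-path distance inside the finite piece slopes(bound).

    Because the graph is truncated, this is only an upper bound for the
    distance in the full Farey graph.
    """
    verts = slopes(bound)
    dist = {u: 0}
    queue = deque([u])
    while queue:
        w = queue.popleft()
        if w == v:
            return dist[w]
        for z in verts:
            if z not in dist and farey_adjacent(w, z):
                dist[z] = dist[w] + 1
                queue.append(z)
    return None  # v is not reachable inside the truncated piece
```

For instance, `farey_adjacent((1, 0), (0, 1))` holds because $a$ and $b$ form a basis, while the slopes $0/1$ and $2/1$ are not adjacent and are joined through $1/1$.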

There are variants of the definition that give rise to quasi-isometric
complexes. For example, one can take the complex of partial bases, where
vertices are conjugacy classes of elements that are part of a basis, and
simplices correspond to compatibility, i.e. subsets of a basis. The
$Aut(\mathbb{F})$-version of this complex was used in [10] to study the Torelli
group. As another example, $\mathcal{F}$ is quasi-isometric to the nerve of the
cover $\{U(A)\}$ of the thin part of Outer space, where for a conjugacy class of
proper free factors $A$, the set $U(A)$ consists of those marked graphs whose
$A$-cover has core of volume $< \epsilon$, for a fixed small $\epsilon > 0$.

Our proof is very much inspired by the Masur-Minsky argument, which
uses Teichmüller theory. Bowditch [7] gave a somewhat simpler argument.
In the remainder of the introduction, we give an outline of the proof of
hyperbolicity of the curve complex which follows [22] and [7], and where we
take a certain poetic license.

The proof starts by defining a coarse projection $\pi : \mathcal{T} \to
\mathcal{C}$ from Teichmüller space. To a marked Riemann surface $X$ one
associates a curve with smallest extremal length. To see that this is well
defined one must argue that short curves intersect a bounded number of times
(in fact, at most once), and then one uses the inequality

$d_{\mathcal{C}}(\alpha, \beta) \le i(\alpha, \beta) + 1$

where $i$ denotes the intersection number. Interestingly, the entire argument
uses only this inequality to estimate distances in $\mathcal{C}$ (and only for
bounded intersection numbers).

Teichmüller space carries the Teichmüller metric, and any two points
are joined by a unique Teichmüller geodesic. If $t \mapsto X_t$ is a
Teichmüller geodesic, consider the (coarse) path $\pi(X_t)$ in $\mathcal{C}$.
One observes:

(i) The collection of paths $\pi(X_t)$ is (coarsely) transitive, i.e. for any
two curves $\alpha, \beta$ there is a path $\pi(X_t)$ that connects $\alpha$ to
$\beta$ (to within a bounded distance).

Next, for any Teichmüller geodesic $\{X_t\}$ one defines a projection
$\mathcal{C} \to \{X_t\}$. Essentially, for a curve $\alpha$ the projection
assigns the Riemann surface $X_t$ on the path in which $\alpha$ has the smallest
length. The key lemma is the following (see [22, Lemma 5.8]), proved using the
intersection number estimate above:

(ii) If $\alpha, \beta$ are adjacent in $\mathcal{C}$ and $X_\alpha, X_\beta$
are their projections to $X_t$, then $\pi(X_\alpha)$ and $\pi(X_\beta)$ are at
uniformly bounded distance in $\mathcal{C}$.

Consequently, one has a (coarse) Lipschitz retraction $\mathcal{C} \to
\pi(X_t)$ for every Teichmüller geodesic $X_t$. It quickly follows that the
paths $\pi(X_t)$ are reparametrized quasi-geodesics (this means that they could
spend a long time in a bounded set, but after removing the corresponding
subintervals the resulting coarse path is a quasi-geodesic with uniform
constants, after possibly reparametrizing).

The final step is: 

(iii) Triangles formed by three projected Teichmüller geodesics are
uniformly thin.

Hyperbolicity of $\mathcal{C}$ now follows by an argument involving an
isoperimetric inequality (see [7, Proposition 3.1] and our Proposition 6.1).

Our argument follows the same outline. In place of the Teichmüller metric
and Teichmüller geodesics we use the Lipschitz metric on Outer space
and folding paths. There are technical complications arising from the
non-symmetry of the Lipschitz metric and the non-uniqueness of folding paths
between a pair of points in Outer space. Our projection from $\mathcal{F}$ to a
folding path comes in two flavors, left and right, and we have to work to show
that the two are at bounded distance from each other when projected to
$\mathcal{F}$. Similarly, we have to prove directly that projections of folding
paths fellow travel, even when the two have opposite orientations. The role of
simple closed curves is played by simple conjugacy classes in $\mathbb{F}$,
i.e. nontrivial conjugacy classes contained in some proper free factor.

The first two hints that Outer space has some hyperbolic features were
provided by Yael Algom-Kfir's thesis [1] and by [5]. Algom-Kfir showed
that axes of fully irreducible automorphisms are strongly contracting. In
the course of our proof we will generalize this result (see Proposition 6.2,
which states that all folding paths are contracting provided their projections
to $\mathcal{F}$ travel with definite speed). In [5] a certain non-canonical
hyperbolic $Out(\mathbb{F})$-complex was constructed. It is also known that
fully irreducible automorphisms act on $\mathcal{F}$ with positive translation
length [20], [5].

Below is a partial dictionary between Teichmüller space and Outer space
relevant to this work.

The paper is organized as follows. In Section 2 we review the basic
notions about Outer space, including the Lipschitz metric, train tracks, and
folding paths. Section 3 proves the analog of the inequality
$d_{\mathcal{C}}(\alpha, \beta) \le i(\alpha, \beta) + 1$, using the Whitehead
algorithm. Section 4 contains some additional material on folding paths,
including the formula for the derivative of a length function, as well as the
key fact that a simple loop which is largely illegal must lose a fraction of its
illegal turns after a definite progress in $\mathcal{F}$. In Section 5 we define
the (left and right) projections of a free factor to a folding path and
establish that images in $\mathcal{F}$ of folding paths are reparametrized
quasi-geodesics. The key technical lemma in that section is Lemma 5.9,
establishing that the two projections are at bounded distance when measured
in $\mathcal{F}$. We end this section with a very useful method of estimating
where the projections lie in Lemma 5.12. In Section 6 we recall the argument
that for hyperbolicity it suffices to establish the Thin Triangle condition,
and we also derive the contraction property of folding paths, measured in
$\mathcal{F}$. In Section 7 we prove the Fellow Travelers property (which of
course follows from the Thin Triangle property), in both parallel and
anti-parallel setting. Finally, in Section 8 we establish the Thin Triangles
property.

The proofs of the three main technical statements in the paper, namely 
Lemma 14.61 Proposition 15. 9( and Lemma I5.1H should be omitted on the 
first reading. 

Acknowledgements. We thank the American Institute of Mathematics 
and the organizers and participants of the Workshop on Outer space in 
October 2010 for a fruitful exchange of ideas. We particularly thank Michael 
Handel for telling us a proof of Lemma 15.11 We also thank Saul Schleimer 
for an inspiring conversation and Yael Algom-Kfir for her comments on
an earlier version of this paper.

2 Review 

In this section we review some definitions and collect standard facts about 
Outer space, Lipschitz metric, train tracks, and folding paths. 

Outer space. A graph is a cell complex $G$ of dimension $\le 1$. The rose
$R_n$ of rank $n$ is the graph with one 0-cell (vertex) and $n$ 1-cells (edges).
A marking of a graph $G$ is a homotopy equivalence $g : R_n \to G$. A metric on
$G$ is a function $\ell$ that assigns to each edge $e$ a positive number
$\ell(e)$. We often view the graph $G$ as the path metric space in which each
edge $e$ has length $\ell(e)$. The Unprojectivized Outer space
$\hat{\mathcal{X}}$ is the space of equivalence classes of triples
$(G, g, \ell)$ where $G$ is a finite graph with no vertices of valence $\le 2$,
$g$ is a marking of $G$, and $\ell$ is a metric on $G$. Two triples
$(G, g, \ell)$ and $(G', g', \ell')$ are equivalent if there is a homeomorphism
$h : G \to G'$ that preserves edge-lengths and commutes with the markings up to
homotopy. Outer space $\mathcal{X}$ is the space of projective classes of such
(equivalence classes of) triples, i.e. modulo scaling the metric. Equivalently,
$\mathcal{X}$ is the space of triples as above where the metric is normalized so
that the volume $vol(G) := \sum \ell(e) = 1$. Assigning length 0 to an edge is
interpreted as a metric on the graph with that edge collapsed, and in this way
$\mathcal{X}$ becomes a complex of simplices with missing faces (the missing
faces correspond to collapsing nontrivial loops), which then induces the
simplicial topology on $\mathcal{X}$. Outer space was introduced by Culler and
Vogtmann [9], who showed that $\mathcal{X}$ is contractible. We will usually
suppress markings and the metric and talk about $G \in \mathcal{X}$. It is
sometimes convenient to pass to the universal cover and regard
$G \in \hat{\mathcal{X}}$ as an action of $\mathbb{F}$ on the tree $\tilde{G}$.

We find the following notation useful. If $z$ is a nontrivial conjugacy
class, it can be viewed as a loop in the rose $R_n$, and via the marking can be
transported to a unique immersed loop in any marked graph $G$. This loop
will be denoted by $z|G$. The length of this loop, i.e. the sum of the lengths
of edges crossed by the loop, counting multiplicities, is denoted by
$\ell(z|G)$. Note that if $z$ is not simple, then $\ell(z|G) \ge 2\, vol(G)$
and in fact $z|G$ must cross every edge at least twice.
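As an illustration of this notation in the simplest case, the sketch below computes the immersed loop and its length when $G$ is a rose with the identity marking, so that the immersed loop is just the cyclically reduced word. The encoding (lowercase letter for a generator, uppercase for its inverse) and the function names are assumptions made for the example.

```python
from collections import Counter

def cyclically_reduce(word):
    """Freely reduce, then strip inverse pairs from the two ends of the word."""
    stack = []
    for ch in word:
        if stack and stack[-1] == ch.swapcase():
            stack.pop()          # cancel x x^{-1}
        else:
            stack.append(ch)
    i, j = 0, len(stack) - 1
    while i < j and stack[i] == stack[j].swapcase():
        i += 1                   # conjugate away matching end letters
        j -= 1
    return "".join(stack[i:j + 1])

def crossing_counts(word):
    """How many times the immersed loop crosses each petal of the rose."""
    return Counter(ch.lower() for ch in cyclically_reduce(word))

def loop_length(word, lengths):
    """Length of the immersed loop: petal lengths summed with multiplicity."""
    return sum(lengths[ch.lower()] for ch in cyclically_reduce(word))
```

With petals of length 1/2 each (volume 1), the class of `aabb`, which is not simple in rank 2, crosses each petal twice and has length 2, matching the lower bound $2\, vol(G)$ stated above.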

Morphisms between trees and train tracks. Recall that a morphism
between two $\mathbb{R}$-trees $S, T$ is a map $\phi : S \to T$ such that every
segment $[x, y] \subset S$ can be subdivided into subintervals on which $\phi$
is an isometric embedding. For simplicity, in this paper we work only with
simplicial metric trees. A direction at $x \in S$ is a germ of non-degenerate
segments $[x, y]$ with $y \ne x$. The set $D_x$ of directions at $x$ can be
thought of as the unit tangent space; a morphism $\phi : S \to T$ determines a
map $D\phi_x : D_x \to D_{\phi(x)}$, thought of as the derivative. A turn at $x$
is an unordered pair of distinct directions at $x$. A turn $\{d, d'\}$ at $x$ is
illegal (with respect to $\phi$) if $D\phi_x(d) = D\phi_x(d')$. Otherwise the
turn is legal. There is an equivalence relation on $D_x$ where $d \sim d'$ if
and only if $d = d'$ or $\{d, d'\}$ is illegal. The equivalence classes are
called gates. If at each $x \in S$ there are at least two gates, the collection
of equivalence relations at each $x$ is called a train track structure on $S$.
This is equivalent to the requirement that $\phi$ embeds each edge of $S$ and
has at least two gates at every vertex. A path in $S$ is legal if it makes only
legal turns.

If $S$ and $T$ are equipped with abstract train track structures (an
equivalence relation on $D_x$ for every vertex $x$ with at least two gates), we
say that a morphism $\phi : S \to T$ is a train track map if on each edge $\phi$
is an embedding and legal turns are sent to legal turns. In particular, legal
paths map to legal paths.

We also extend this terminology to maps between graphs. If
$\phi : A \to \Sigma$ is a map between connected graphs such that the lift
$\tilde\phi : \tilde{A} \to \tilde\Sigma$ is a morphism of trees, we can define
the notion of legal and illegal turns on $\tilde{A}$, which descends to $A$. If
there are at least two gates at each point, we have a train track structure on
$A$. If $A$ and $\Sigma$ are equipped with abstract train track structures, the
map $\phi$ is a train track map if it sends edges to legal paths and legal turns
to legal turns. When the graphs $A$ or $\Sigma$ are not connected we work with
components separately.

Lipschitz metric and optimal maps. Let $G$ and $G'$ be two points in
$\mathcal{X}$ normalized so the volume is 1. The homotopy class of maps
$h : G \to G'$ such that $hg \simeq g'$ (with $g, g'$ markings for $G, G'$) is
called the difference of markings. The Lipschitz distance
$d_{\mathcal{X}}(G, G')$ between $G$ and $G'$ is the log of the minimal
Lipschitz constant over all difference of markings maps $G \to G'$.
For more information, see [UlElll]- The basic fact, that plays the role of 
Teichmüller's theorem in Teichmüller theory, is the following statement, due
to Tad White. For a proof see [H] or [4]. 

Proposition 2.1. Let $G, G' \in \mathcal{X}$. There is a difference of markings
map $\phi : G \to G'$ with the following properties.

• $\phi$ sends each edge of $G$ to an immersed path (or a point) with constant
speed (called the slope of $\phi$ on that edge).

• The union of all edges of $G$ on which $\phi$ has the maximal slope, say
$\lambda$, is a subgraph of $G$ with no vertices of valence 1. This subgraph is
called the tension graph, denoted $\Delta = \Delta(\phi)$.

• $\phi$ induces a train track structure on $\Delta$.

The last bullet says that when $\Delta$ is rescaled by $\lambda$ and
$\phi : \lambda\Delta \to G'$ is lifted (on each component) to the universal
cover so it becomes a morphism of trees, each vertex has at least two gates.
Note that the Lipschitz constant of $\phi$ is $\lambda$, so
$d_{\mathcal{X}}(G, G') \le \log\lambda$. But in fact equality holds since
the presence of at least two gates at each vertex of $\Delta$ guarantees that it
contains legal loops, whose length gets stretched by precisely $\lambda$, so
there can be no better map homotopic to $\phi$. We call any $\phi$ satisfying
the Proposition an optimal map. Unfortunately, unlike Teichmüller maps, optimal
maps are not uniquely determined by $G, G'$.

A legal loop can be constructed in $\Delta$ by starting with an edge and
extending it inductively to longer legal edge paths until some oriented edge
is repeated. In fact, this guarantees the existence of a "short" legal path.
We will say a loop in $\Delta$ is a candidate if it is either embedded, or it
forms the figure 8, or it forms a "dumbbell". Every candidate determines a
conjugacy class that generates a free factor of rank 1. Thus $\Delta$ admits a
legal candidate.
See [n]. 

Folding at speed 1. Now assume that $S, T$ represent points of
Unprojectivized Outer space $\hat{\mathcal{X}}$ (i.e. they are universal covers
of marked metric graphs), and $\phi : S \to T$ is an equivariant morphism.
Equivariantly subdivide $S$ so that $\phi$ embeds every edge. Choose some
$\epsilon > 0$ smaller than half of the length of any edge in $S$. Then for
$t \in [0, \epsilon]$ define the tree $S_t$ as the quotient of $S$ by the
equivalence relation: $u \sim_t v$ if and only if there is a vertex $x$ with
$d(x, u) = d(x, v) \le t$ and $\phi(u) = \phi(v)$. Then $S_t$ represents a
point in $\hat{\mathcal{X}}$ and $\phi$ factors as
$S \xrightarrow{\phi_{0t}} S_t \xrightarrow{\phi_{t\infty}} T$ for some
morphism $\phi_{t\infty}$ and the quotient map $\phi_{0t} : S \to S_t$, which is
also a morphism. The trees $S_t$, $t \in [0, \epsilon]$, form a path in
$\hat{\mathcal{X}}$, and $S_0 = S$. We say that this path is obtained from $S$
by "folding all illegal turns at speed 1" with respect to $\phi$.

If $\phi$ induces only one gate at some vertex $v \in S$, then $S_t$ will have
a valence 1 vertex for $t > 0$. In that case we always pass to the minimal
subtree of $S_t$. When $\phi$ induces a train track structure on $S$, $S_t$ is
automatically minimal (if $S$ is). For simplicity we state the following
Proposition in the train track situation only.

Proposition 2.2. Let $\phi : S \to T$ be a morphism between two trees in
$\hat{\mathcal{X}}$ inducing a train track structure on $S$. There is a
(continuous) path $S_t$ in $\hat{\mathcal{X}}$, $t \in [0, \infty)$, and there
are morphisms $\phi_{st} : S_s \to S_t$ for $s \le t$ so that the following
holds:

1. $S_0 = S$, $S_t = T$ for $t$ large,

2. $\phi_{tt} = Id$, $\phi_{su} = \phi_{tu}\phi_{st}$ for $s \le t \le u$,

3. $\phi_{0t} = \phi$ and $\phi_{tt'} = Id$ for large $t < t'$,

4. each $\phi_{st}$ isometrically embeds edges and induces at least two gates
at every vertex of $S_s$,

5. for $s < t, t'$ the illegal turns at vertices of $S_s$ with respect to
$\phi_{st}$ coincide with those with respect to $\phi_{st'}$, so $S_s$ has a
well-defined train track structure,

6. for every $s < t$ there is $\epsilon > 0$ so that $S_{s+\tau}$,
$\tau \in [0, \epsilon]$, is obtained from $S_s$ by folding all illegal turns
at speed 1 with respect to $\phi_{st}$.

Moreover, this path is unique.

Proof. Uniqueness is clear from the definition of folding at speed 1. There
can be no last time $s$ so that two paths satisfying the above conditions agree
(including the maps $\phi_{tt'}$) up to $S_s$ but no further, by item 6.

There are three methods to establish existence, and they will be only 
sketched. 

2.2 A. Stallings' Method. This works when $S$ and $T$ can be subdivided
so that $\phi$ is simplicial and all edge lengths are rational (or fixed
multiples of rational numbers). In our applications we can arrange that this
assumption holds. Then we may subdivide further so that all edge lengths
are equal. The path $S_t$ is then obtained exactly as in Stallings' beautiful
paper [25], by inductively identifying any pair of edges with a common
vertex that map to the same edge in $T$. This operation of elementary folding
can be performed continuously to yield a 1-parameter family of trees, i.e.
a path, between the original tree and the folded tree. Putting these paths
together gives the path $S_t$.
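The combinatorial core of this method, identifying pairs of edges with a common vertex and the same image, can be sketched on a finite quotient graph. The version below is a discrete illustration only: it works with a finite labeled graph rather than the universal cover, ignores edge lengths and the continuous parametrization, and the encoding of an edge as a triple `(tail, label, head)` (the label naming the image edge) is an assumption made for the example.

```python
def stallings_fold(edges):
    """Fold a finite labeled directed graph until no fold is possible.

    edges: iterable of (tail, label, head) with comparable vertex names.
    Two edges with the same label and a common tail (or a common head) are
    identified by merging their other endpoints. Returns the folded edge set,
    with each vertex replaced by a representative of its merged class.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    edges = set(edges)
    changed = True
    while changed:
        changed = False
        # Rewrite every edge in terms of current class representatives.
        edges = {(find(u), a, find(v)) for (u, a, v) in edges}
        out_seen, in_seen = {}, {}
        for (u, a, v) in sorted(edges):
            if (u, a) in out_seen and out_seen[(u, a)] != v:
                parent[find(v)] = find(out_seen[(u, a)])   # fold at the tail
                changed = True
                break
            out_seen[(u, a)] = v
            if (v, a) in in_seen and in_seen[(v, a)] != u:
                parent[find(u)] = find(in_seen[(v, a)])    # fold at the head
                changed = True
                break
            in_seen[(v, a)] = u
    return edges
```

For example, two `a`-edges leaving vertex 0, each followed by a `b`-edge, fold to a single path labeled `ab`: the first fold creates a new pair of foldable `b`-edges, which is why the procedure is inductive.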

2.2 B. Via the vertical thickening of the graph of $\phi$. This method
is due to Skora [24], who built on the ideas of Steiner [26]. Skora's preprint
was never published; the interested reader may find the details in [8].
Consider the graph of $\phi$ as a subset of $S \times T$ and define the
"vertical $t$-thickening" of it as

$W_t = \{(x, y) \in S \times T \mid d(\phi(x), y) \le t\}$

Next, consider the decomposition $\mathcal{D}_t$ of $W_t$ into the path
components of the sets $W_t \cap S \times \{y\}$, $y \in T$. Let
$S_t = W_t/\mathcal{D}_t$ be the decomposition space with the metric defined as
follows. A path in $W_t$ is linear if its projection to both $S$ and $T$ has
constant speed (possibly speed 0). A piecewise linear path $\gamma$ in $W_t$ is
taut if the preimages $\gamma^{-1}(L)$ of leaves $L$ of $\mathcal{D}_t$ are
connected. Then define the distance in $S_t$ as the length of the projection to
$T$ of any piecewise linear taut path connecting the corresponding leaves in
$W_t$. In this way $S_t$ becomes a metric tree. The morphisms $\phi_{st}$ are
induced by inclusion $W_s \hookrightarrow W_t$.

2.2 C. Via integrating the speed 1 folding direction. Starting
at $S$ consider the path $S_t$, $t \in [0, \epsilon]$, obtained by folding all
illegal turns at speed 1. Now extend this path by folding all illegal turns of
$S_\epsilon$ at speed 1. Continue in this way inductively, and show that either
$T$ is reached in finitely many steps, or there is a well defined limiting
tree, from which folding can proceed. This is the approach taken in [11], to
which the reader is referred for further discussion.

One possible approach is as follows. Say $S_t$ is defined for
$t \in [0, t_0)$ with $S_0 = S$. To define the limiting tree $S_{t_0}$, note
that for each conjugacy class, the length along the path is nonincreasing and
thus converges. The limiting length function defines a tree, representing a
point in compactified Outer space. The lengths of conjugacy classes are bounded
below by their values in $T$, so the limiting tree $S_{t_0}$ is free simplicial
and thus represents a point in Outer space. We may view the tree $S_{t_0}$ as
the equivariant Gromov-Hausdorff limit of the path $S_t$. The maps
$S \to S_t$, viewed as subsets of $S \times S_t$ via their graphs, subconverge
to a morphism $S \to S_{t_0}$, and similarly by a diagonal argument one
constructs morphisms $S_t \to S_{t_0}$ that compose correctly. To show
uniqueness of such morphisms, one uses Gromov-Hausdorff limits and the fact
that the only (equivariant) morphism $S_{t_0} \to S_{t_0}$ is the identity. □

The following lemmas are stated for clarity and their proofs are left as 
exercises. 

Lemma 2.3. Let $S_t$ be the path obtained from a morphism $S \to T$ by folding
all illegal turns at speed 1 and let $R \to S$ be a morphism, so that we have
morphisms $R \to S_t$ obtained by composing. Then the tree obtained from $R$
by folding all illegal turns at speed 1 for time $t$ with respect to
$R \to S_{t'}$, $t' > t$, coincides with the tree obtained from $R$ by folding
all illegal turns at speed 1 for time $t$ with respect to $R \to T$.

If $A$ is a nontrivial finitely generated subgroup of $\mathbb{F}$ and
$T \in \hat{\mathcal{X}}$, denote by $A|T$ the minimal $A$-invariant subtree of
$T$.

Lemma 2.4. Let $S_t$ be the path obtained from $S \to T$, $A$ a finitely
generated subgroup of $\mathbb{F}$ and $R \to S$ a morphism. Let $R_t$ be the
path obtained from $R \to T$. If $A|R \to A|S$ is an isomorphism, so is
$A|R_t \to A|S_t$.

If $S_t$ is a path constructed above and $A < \mathbb{F}$ is a finitely
generated subgroup, $A|S_t$ is also a path obtained by folding illegal turns
with speed 1, but here often the train track condition fails and one must pass
to minimal subtrees.

Folding paths in $\mathcal{X}$. We define a folding path in $\mathcal{X}$ to be
the projection of a path in $\hat{\mathcal{X}}$ from Proposition 2.2. We will
also refer to paths from Proposition 2.2 as folding paths (in
$\hat{\mathcal{X}}$). The train track condition will always hold, except when
we pass to subgroups of $\mathbb{F}$.

Proposition 2.5 ([11]). Let $G, \Sigma \in \mathcal{X}$. There is a geodesic
from $G$ to $\Sigma$ which is the concatenation of two paths: the first is a
linear path in a single simplex, and the second is a folding path.

Proof. Fix an optimal map $\phi : G \to \Sigma$, and let
$\Delta = \Delta(\phi) \subset G$ be its tension graph. If $\Delta = G$ then
$\phi : \lambda G \to \Sigma$ (where $\lambda G$ is $G$ scaled so that the slope
of $\phi$ is 1 on every edge) is a morphism when lifted to the universal cover
and it satisfies the requirement that it induces a train track structure on $G$.
The folding path of Proposition 2.2 gives a folding path $G_t$ in
$\mathcal{X}$ from $G$ to $\Sigma$. To see that this path is a geodesic, note
that each $\phi_{st} : G_s \to G_t$ is an optimal map, so
$d_{\mathcal{X}}(G_s, G_t) = \log\lambda(\phi_{st})$. Thus
$\phi_{su} = \phi_{tu}\phi_{st}$ for $s < t < u$ implies
$d_{\mathcal{X}}(G_s, G_u) = d_{\mathcal{X}}(G_s, G_t) + d_{\mathcal{X}}(G_t, G_u)$.

Now suppose $\Delta \ne G$. Let $\lambda$ be the slope of $\phi$ on the edges
of $\Delta$. Denote by $e_1, \cdots, e_k$ the (topological) edges of $G$ outside
of $\Delta$. For each tuple $x = (x_1, x_2, \cdots, x_k)$ of lengths in the cube
$[0, \ell_1] \times [0, \ell_2] \times \cdots \times [0, \ell_k]$, where
$\ell_i$ is the length of $e_i$ in $G$, denote by $\mu(x)$ the smallest maximal
slope among maps $G \to \Sigma$ that are homotopic to $\phi$ rel $\Delta$, where
$G$ is given the metric $x$ outside $\Delta$ (so $\mu(x) = \infty$ if some loop
is assigned length 0). Among all $x$ in the cube with $\mu(x) = \lambda$ choose
one with the smallest sum of the coordinates, say $x_0$. Denote by $G'$ the
graph $G$ with the metric $x_0$ outside of $\Delta$. (Some edges may get length
0 and the rescaled $G'$ is then on the boundary of the original simplex.) Let
$\phi_0 : G' \to \Sigma$ be a map homotopic to $\phi$ rel $\Delta$, linear on
edges, and with the maximal slope $\lambda(\phi_0) = \lambda$. Now it is clear
that $\Delta(\phi_0) = G'$ (otherwise some edge length can be reduced,
contradicting the choice of $x_0$) and that $\phi_0$ induces at least two gates
at every vertex (otherwise $\phi_0$ may be perturbed so that the tension graph
becomes a proper subgraph, see e.g. the proof of Proposition 2.1 in [3]). Since
we have maps $G \to G'$ and $G' \to \Sigma$ with slopes 1 and $\lambda$
respectively, we also have
$d_{\mathcal{X}}(G, \Sigma) = d_{\mathcal{X}}(G, G') + d_{\mathcal{X}}(G', \Sigma)$. □

A folding path $G_t$ can be parametrized by arc-length, so that
$d_{\mathcal{X}}(G_s, G_t) = t - s$ for $s < t$.

3 Detecting boundedness in the free factor complex $\mathcal{F}$

In this section we define a coarse projection $\pi : \mathcal{X} \to
\mathcal{F}$ and prove an analog of the inequality
$d_{\mathcal{C}}(\alpha, \beta) \le 1 + i(\alpha, \beta)$ (see Lemma 3.2). An
immediate consequence is that $\pi$ is coarsely Lipschitz (see Corollary 3.4).

Recall that a nontrivial conjugacy class $x$ is simple if it is contained in a
proper free factor. When $x$ is any nontrivial conjugacy class and $G$ a marked
graph, we denote by $x|G$ the unique immersed loop in $G$ that represents $x$.

If $x$ is a simple class, denote by $\mathbf{x}$ the smallest free factor
containing $x$.

Note that any proper connected subgraph $P$ of a marked graph $G$ that
contains a circle defines a vertex $\mathbf{P}$ of $\mathcal{F}$.

Lemma 3.1. Let $G$ be a marked graph. If $P, Q \subset G$ are two proper
connected subgraphs defining free factors $\mathbf{P}, \mathbf{Q}$ then
$d_{\mathcal{F}}(\mathbf{P}, \mathbf{Q}) \le 4$.

Proof. If rank$(\mathbb{F}) = 2$ then
$d_{\mathcal{F}}(\mathbf{P}, \mathbf{Q}) \le 1$ (using the modified definition
of $\mathcal{F}$). Now assume rank$(\mathbb{F}) \ge 3$. Enlarge $P, Q$ to
connected graphs $P', Q'$ that contain all but one edge of $G$. Thus their
intersection contains a circle $R$, so we have a path in $\mathcal{F}$ given by
the subgraphs $P, P', R, Q', Q$ and we see
$d_{\mathcal{F}}(\mathbf{P}, \mathbf{Q}) \le 4$. □

Now define a multivalued function $\pi : \mathcal{X} \to \mathcal{F}$ by

$\pi(G) = \{\mathbf{P} \mid P \subset G\}$

By Lemma 3.1 the diameter of each $\pi(G)$ is bounded by 4, and $\pi$ may be
viewed as a coarsely defined function. We refer to $\pi$ as the coarse
projection from $\mathcal{X}$ to $\mathcal{F}$.

Lemma 3.2. Let $G$ be a marked graph and $x$ a simple class. If $x|G$ crosses
an edge $e$ $k$ times, then the distance in $\mathcal{F}$ between $\mathbf{x}$
and some free factor represented by a subgraph of $G$ is $\le 6k + 9$.

Proof. First assume that $e$ is nonseparating. By collapsing a maximal
tree in $G$ that does not contain $e$ we may assume that $G$ is a rose. Let
$a_1, a_2, \cdots, a_m, c$ be the associated basis with $c$ corresponding to
$e$ and set $A = \langle a_1, \cdots, a_m\rangle$. Thus $c$ appears in the
cyclic word for $x$ $k$ times. If the Whitehead graph of $x$ is disconnected,
consider a 1-edge blowup $\hat{G}$ of $G$ so that $x$ realized in $\hat{G}$ is
contained in a proper subgraph. In this case
$d_{\mathcal{F}}(\mathbf{x}, A) \le 5$ by Lemma 3.1 (4 for the distance between
$A$ and the free factor determined by the image of $x$, and 1 more to get to
$\mathbf{x}$). If the Whitehead graph is connected then it has a cut point
[27], [21]. Let $\phi$ be the associated Whitehead automorphism. If the special
letter is some $a_i^{\pm 1}$ then the free factor $A$ is $\phi$-invariant. If
the special letter is $c^{\pm 1}$ then $d_{\mathcal{F}}(A, \phi(A)) \le 6$
($d_{\mathcal{F}}(A, \langle c\rangle) \le 3$ and $\langle c\rangle$ is fixed
by $\phi$). But there are at most $k$ automorphisms of the latter kind in the
process of reducing $x$ until its Whitehead graph is disconnected. Thus
$d_{\mathcal{F}}(\mathbf{x}, A) \le 6k + 5$.

Now assume $e$ is separating. By collapsing a maximal tree on each side
of $e$ we may assume that $G$ is the disjoint union of two roses $R_a$ and
$R_b$ connected by $e$. Let $a_1, \cdots, a_n$ and $b_1, \cdots, b_m$ be the
bases determined by $R_a$ and $R_b$ respectively. Notice that the assumption
about $e$ means that there are $k$ times when the cyclic word for $x$ changes
from the $a_i$'s to the $b_j$'s or vice versa ($k$ is necessarily even here).
If the Whitehead graph of $x$ with respect to
$a_1, \cdots, a_n, b_1, \cdots, b_m$ is disconnected, we see as above that
$d_{\mathcal{F}}(\mathbf{x}, A) \le 5$ where
$A = \langle a_1, \cdots, a_n\rangle$. Otherwise there is a cut point and let
$\phi$ be the associated Whitehead automorphism. If the special letter is some
$a_i^{\pm 1}$ then $A$ is $\phi$-invariant. Likewise, $\phi(A)$ is conjugate to
$A$ if the special letter is $b_j^{\pm 1}$ and all the $a_i^{\pm 1}$ are on one
side of the cut. If they are not on one side of the cut, then the subgraph
spanned by the $a_i^{\pm 1}$'s is disconnected and we may consider the
associated 1-edge blowup $\hat{R}_a$ of $R_a$. Let $\hat{G}$ be the 1-edge
blowup of $G$ obtained by attaching $e \cup R_b$ to $\hat{R}_a$ along either of
the two vertices. The blowup edge $e'$ can be crossed by $x$ only if it is
immediately followed or preceded by $e$ (but not both). Thus $x$ crosses $e'$
at most $k$ times. If $e'$ is nonseparating then by the first paragraph
$d_{\mathcal{F}}(\mathbf{x}, \mathbf{P}) \le 6k + 5$ for some subgraph
$P \subset \hat{G}$ and so
$d_{\mathcal{F}}(\mathbf{x}, \langle b_1, \cdots, b_m\rangle) \le 6k + 9$. If
$e'$ is separating replace $A$ by a smaller free factor $A'$ and continue. □
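The proof repeatedly inspects the Whitehead graph of a cyclic word for disconnectedness or a cut point. A minimal sketch of that check follows, under the usual convention (an assumption here, since the text does not fix one) that the vertices are the basis letters and their inverses and that each cyclic subword $xy$ contributes an edge joining $x$ to $y^{-1}$. Lowercase letters stand for generators and uppercase for inverses; the function names are illustrative.

```python
def whitehead_graph(word, basis):
    """Adjacency sets of the Whitehead graph of a cyclically reduced word."""
    adj = {v: set() for g in basis for v in (g, g.upper())}
    n = len(word)
    for i in range(n):
        x, y = word[i], word[(i + 1) % n]   # consecutive letters, cyclically
        u, v = x, y.swapcase()              # edge joins x to the inverse of y
        adj[u].add(v)
        adj[v].add(u)
    return adj

def is_connected(adj, removed=None):
    """Connectivity of the graph, optionally with one vertex deleted."""
    verts = [v for v in adj if v != removed]
    if not verts:
        return True
    seen = {verts[0]}
    stack = [verts[0]]
    while stack:
        w = stack.pop()
        for z in adj[w]:
            if z != removed and z not in seen:
                seen.add(z)
                stack.append(z)
    return len(seen) == len(verts)

def cut_vertices(adj):
    """Vertices of a connected graph whose removal disconnects it."""
    if not is_connected(adj):
        return set()
    return {v for v in adj if not is_connected(adj, removed=v)}
```

In rank 2, the primitive word `ab` has a disconnected Whitehead graph, the primitive word `aab` has a connected graph with cut vertices, and `aabb` gives a 4-cycle with neither property, consistent with its not being simple.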

We introduce the following notation: 

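• when $G, G' \in \mathcal{X}$, $d_{\mathcal{F}}(G, G') := \sup_{P, P'}
d_{\mathcal{F}}(\mathbf{P}, \mathbf{P}')$, where the sup is over subgraphs
$P \subset G$, $P' \subset G'$, and $\mathbf{P}$ is the free factor determined
by $P$,

• when $G \in \mathcal{X}$ and $A \in \mathcal{F}$ is a vertex,
$d_{\mathcal{F}}(A, G) = d_{\mathcal{F}}(G, A) := \sup_{P}
d_{\mathcal{F}}(A, \mathbf{P})$,

• when $G \in \mathcal{X}$ and $x$ is a simple class,
$d_{\mathcal{F}}(G, x) = d_{\mathcal{F}}(x, G) := d_{\mathcal{F}}(\mathbf{x}, G)$,
where $\mathbf{x}$ is the smallest free factor containing $x$,

• $d_{\mathcal{F}}(A, x) = d_{\mathcal{F}}(x, A) := d_{\mathcal{F}}(A, \mathbf{x})$.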



For example, Lemma 3.1 and Lemma 3.2 combine to give the following.

Lemma 3.3. Let $x$ be a simple class and $G \in \mathcal{X}$ so that $x|G$
crosses some edge $\le k$ times. Then $d_{\mathcal{F}}(G, x) \le 6k + 13$.

Corollary 3.4. If $d_{\mathcal{X}}(G, G') \le \log K$ then
$d_{\mathcal{F}}(G, G') \le 12K + 32$.

Proof. Let $z$ be a candidate that realizes $d_{\mathcal{X}}(G, G')$. Thus
$\ell(z|G) \le 2$, $z|G$ crosses some edge once and $\ell(z|G') \le 2K$, so
$z|G'$ crosses some edge $\le 2K$ times. Therefore

$d_{\mathcal{F}}(G, G') \le d_{\mathcal{F}}(G, z) + d_{\mathcal{F}}(z, G')
\le 19 + (12K + 13) = 12K + 32$

□
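The constants in this proof come from two applications of Lemma 3.3, once with $k = 1$ and once with $k = 2K$. Spelled out, using only the bounds already stated:

```latex
\begin{align*}
d_{\mathcal{F}}(G, z)  &\le 6 \cdot 1 + 13 = 19
  && (z|G \text{ crosses some edge once}),\\
d_{\mathcal{F}}(z, G') &\le 6 \cdot 2K + 13 = 12K + 13
  && (z|G' \text{ crosses some edge at most } 2K \text{ times}),\\
d_{\mathcal{F}}(G, G') &\le d_{\mathcal{F}}(G, z) + d_{\mathcal{F}}(z, G')
  \le 19 + (12K + 13) = 12K + 32.
\end{align*}
```

The second crossing bound uses that $G'$ has volume 1: a loop crossing every edge more than $2K$ times would have length greater than $2K$.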

In most of the paper we will be concerned with showing that distances
in $\mathcal{F}$ are bounded. We will use the obvious terminology: In
Corollary 3.4 we showed that the distance in $\mathcal{F}$ between projections
of graphs from $\mathcal{X}$ is bounded as a function of the distance in
$\mathcal{X}$. When we say a distance in $\mathcal{F}$ is bounded without any
variables, we mean by a universal constant that depends only on the rank $n$.

Note that Corollary 3.4 says that $\pi : \mathcal{X} \to \mathcal{F}$ is
coarsely Lipschitz: If $d_{\mathcal{X}}(G, G') \le N$ with $N$ an integer, then
$d_{\mathcal{F}}(G, G') \le CN$ for a universal $C > 0$. Indeed, choose a
geodesic from $G$ to $G'$ and apply the Corollary $N$ times to pairs of points
at distance $\le 1$.

By the injectivity radius $injrad(G)$ of a metric graph $G$ we mean the
shortest length of an embedded loop in $G$. If $A$ is a finitely generated
subgroup of $\mathbb{F}$ and $G \in \mathcal{X}$ we denote by $A|G$ the core of
the covering space of $G$ corresponding to $A$. Thus there is a canonical
immersion $A|G \to G$. We will endow $A|G$ with the induced metric structure
(and the induced train track structure if $G$ has one).

Corollary 3.5. If $injrad(A|G) < k + 1$ then $d_{\mathcal{F}}(A, G) \le 6k + 14$.

Coarse paths $\pi(G_t)$ in $\mathcal{F}$ obtained from folding paths by
projecting to $\mathcal{F}$ will play a crucial role.

Corollary 3.6. The projections to $\mathcal{F}$ of the folding paths in
$\mathcal{X}$ form a coarsely transitive family: for any two free factors
$A, B$ there is a folding path $G_t$, $t \in [\alpha, \omega]$, such that
$A \in \pi(G_\alpha)$ and $B \in \pi(G_\omega)$. The same is true for the
subcollection consisting of folding paths induced by morphisms that satisfy
the rationality condition from the Stallings method of folding in 2.2 A.

Proof. Let $A, B$ be two free factors. Choose $G, \Sigma \in \mathcal{X}$ so
that some subgraph of $G$ represents $A$ and some subgraph of $\Sigma$
represents $B$, and so that $G$ is a rose, and apply Proposition 2.2. The
initial path keeps the underlying graph a rose and its coarse projection is
still $A$.

To achieve rationality, choose $\Sigma$ to be a rose with rational edge
lengths. Let $\phi : G \to \Sigma$ be an optimal map after adjusting the metric
so that $\Delta(\phi) = G$. If the vertex of $G$ maps to the vertex of
$\Sigma$, rationality is automatic. Otherwise the vertex of $G$ maps to a point
in the interior of some edge. Perturb $\phi$ so that this point is rational,
and adjust the edge lengths in $\Sigma$ so that the perturbed map is optimal,
with the same train track structure. The new map satisfies rationality. □

4 More on folding paths 

We now discuss folding in more detail. Let $G_t$, $t \in [\alpha, \omega]$, be
a folding path in $\mathcal{X}$ (from now on we replace trees by quotient
graphs). So for $s \le t$ we have maps $\phi_{st} : G_s \to G_t$ that have
slope 1 on each edge, they immerse each edge, and induce train track
structures. By construction, all illegal turns are folded with speed 1.

Number of illegal turns. If a graph $G$ is equipped with a train track
structure, by the number of illegal turns in $G$ we mean

$$m(G) = \sum_{v} \sum_{\Omega_v} \big(|\Omega_v| - 1\big)$$

where the sum is over all vertices $v$ of $G$ and all gates $\Omega_v$ at $v$. Thus a gate
that contains $k \ge 1$ directions contributes $k - 1$ to the count. This choice is
explained by the fact that then the right derivative of the function

$$t \mapsto \operatorname{vol}(G_t)$$

at $t = t_0$ is $-m(G_{t_0})$, the negative of the number of illegal turns in $G_{t_0}$. The
defining interval $[\alpha,\omega]$ can be subdivided as $\alpha = s_0 < s_1 < \cdots < s_k = \omega$
so that on each half-open interval $[s_j, s_{j+1})$ the number of illegal turns is
constant. We will sometimes abbreviate $m(G_t)$ by simply $m_t$.
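The counting rule above (a gate with $k$ directions contributes $k - 1$) can be sketched in a few lines of Python. The dictionary encoding of a train track structure, with each vertex mapped to the list of its gate sizes, is an assumption made only for this illustration and is not taken from the paper.

```python
from typing import Dict, Hashable, List


def count_illegal_turns(gate_sizes: Dict[Hashable, List[int]]) -> int:
    """Sum (size - 1) over all gates at all vertices.

    gate_sizes maps each vertex to the sizes of its gates
    (hypothetical encoding, chosen for illustration only).
    """
    return sum(size - 1 for sizes in gate_sizes.values() for size in sizes)


# Hypothetical example: vertex "v" has gates of sizes 2 and 1,
# vertex "w" has a single gate of size 3.
example = {"v": [2, 1], "w": [3]}
print(count_illegal_turns(example))  # (2-1) + (1-1) + (3-1) = 3
```

A vertex where every gate consists of a single direction contributes nothing, which matches the convention that all turns there are legal.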

Unfolding. Traversing a folding path in reverse is unfolding. Given an
immersed path $\gamma$ in $G_\omega$, one may try to "lift" it along the folding path,
i.e. to find immersed paths $\gamma_t$ in $G_t$ that map to $\gamma$ (up to homotopy rel
endpoints). This is always possible, since it is clearly possible locally. At
discrete times new illegal turns may appear inside the path. Note that
at discrete times the lifts are not unique, when an endpoint of the path
coincides with the vertex of an illegal turn, which then unfolds the direction
of the path. Figures 1 and 2 illustrate the nonuniqueness of lifts.

Figure 1: The figure illustrates the ambiguity in lifting paths under unfolding.
The bottom part is a local fold and the top part is a path being lifted.



To get uniqueness, we can remove the end of the path that lifts nonuniquely.
Thus we may have to remove segments at the ends whose size grows at speed
1. Now suppose there are illegal turns in the path. As we unfold, each illegal
turn makes the length of the path grow with speed 2, and the illegal
turn closest to an end moves away from the end at speed 1. We deduce
that lifting is unique between the first and last illegal turns along the path
$\gamma$, including the germs of directions beyond these turns. We call this the
unfolding principle.

In particular, this applies to illegal turns themselves: if a loop $z|G_\omega$ has
two occurrences of the same illegal turn, pulling back these turns produces
two occurrences of the same path (most of the time a neighborhood of a
single illegal turn, but see Figure 3).

But note that distinct illegal turns might pull back to the same illegal
turn, see Figure 4.

It is useful to have a picture of what unfolding looks like locally, near a
vertex. Let $v$ be a vertex of $G_t$ and $N$ a small closed connected neighborhood
around $v$ (so $N$ is the cone on a finite set). We describe the preimage $N_\epsilon$ of
$N$ in $G_{t-\epsilon}$ for a small $\epsilon > 0$. We will also equip $N_\epsilon$ with a height function.
First place the preimages of $v$ in $N_\epsilon$ at height 0. They get identified under
the fold to $G_t$, so there is a collection of V's (pairs of arcs of length $\epsilon$ joined
at an illegal turn) in $N_\epsilon$ whose union is a tree and so that the associated
folds identify the preimages of $v$. We will place these V's upside down so
that the interior vertices are at height $\epsilon$. By a widget we will mean the
union of all V's with a common interior vertex. Each of the height $\epsilon$ vertices
has a unique direction not contained in any of the widgets, and we draw
this direction upwards. Height 0 vertices may have additional directions not
contained in the V's, and we draw those downwards. See Figure 5.

Figure 2: Another example of nonuniqueness of lifting under unfolding.

Figure 3: An illegal turn lifts to a path with two illegal turns (just as in
Figure 2).

All illegal turns in $N_\epsilon$ appear at the vertices at height $\epsilon$ and these vertices
have two gates (all downward directions form one gate and the single upward
direction is the other gate). All turns at the height 0 vertices are legal. After
the widgets are folded, in $G_t$, the height 0 vertices get identified to $v$, each
widget contributes an upward direction at $v$, and the downward directions
at $v$ come from downward directions in $N_\epsilon$ based at height 0 vertices. Some
pairs of these directions may be illegal in $G_t$, but they have to come from
directions in $N_\epsilon$ that don't form a turn (i.e. they are based at different
vertices).

Figure 4: In the folding path indicated the number of illegal turns grows
from 1 to 2.

Figure 5: An example of $N_\epsilon$ with 3 widgets, 3 vertices at height $\epsilon$ and 5
vertices at height 0.

In the next Lemma we consider a folding path in $\mathcal{X}$ parametrized by
arc-length, and with graphs normalized to have volume 1. The proof is left
to the reader.

Lemma 4.1. Let $z|G_0$ be any immersed loop, $m_0$ the number of illegal
turns in $G_0$ and $k_0$ the number of illegal turns in $z|G_0$. Then the right
derivative of the length function $t \mapsto \ell(z|G_t)$ at $t = 0$ is

$$\frac{d}{dt}\,\ell(z|G_t)\Big|_{t=0^+} = \ell(z|G_0) - 2\,\frac{k_0}{m_0}.$$

Similarly, the right derivative at $0$ of the length of a legal segment whose
endpoints are illegal turns is $L_0 - \frac{2}{m_0}$, where $L_0$ is the length of the segment.

Notice that for some $\epsilon > 0$ the number of illegal turns in $G_t$ and in $z|G_t$
are constant for $t \in [0, \epsilon)$. Therefore we have that

$$\ell(z|G_t) = a e^{t} + b$$

on this interval, for certain constants $a, b$ that can be computed from $\ell(z|G_0)$
and $\frac{d}{dt}\ell(z|G_t)|_{t=0^+}$. Thus if $w$ is another conjugacy class and
$\frac{d}{dt}\ell(z|G_t)|_{t=0^+} \ge \frac{d}{dt}\ell(w|G_t)|_{t=0^+}$ then $\ell(z|G_t) - \ell(w|G_t)$ is nondecreasing on $[0, \epsilon)$.

If the average length of a maximal legal segment in $z|G_0$ is $> 2/m_0$ the
loop grows in length, and if it is $< 2/m_0$ it shrinks. In particular, legal loops
grow exponentially.
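As a sketch of how the constants $a$ and $b$ can be computed, assume that $k_0$ and $m_0$ stay constant on $[0,\epsilon)$, so that the derivative formula of Lemma 4.1 holds at every time in that interval (the right derivative is treated here as an ordinary derivative):

```latex
\frac{d}{dt}\,\ell(z|G_t) = \ell(z|G_t) - \frac{2k_0}{m_0}
\quad\Longrightarrow\quad
\ell(z|G_t) = a e^{t} + b,
\qquad
a = \ell(z|G_0) - \frac{2k_0}{m_0},
\qquad
b = \frac{2k_0}{m_0}.
```

In this form $a$ is exactly the right derivative at $t = 0$, and $a > 0$ is equivalent to $\ell(z|G_0)/k_0 > 2/m_0$, the condition on the average length of a maximal legal segment. For a legal loop $k_0 = 0$, so $\ell(z|G_t) = \ell(z|G_0)\,e^{t}$.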



Corollary 4.2. A legal segment of length $L_0 > 2$ inside $z|G_0$ gives rise to
a legal segment of length $L_t \ge 2 + (L_0 - 2)e^{t}$ inside $z|G_t$. In particular, a
legal segment of length $L_0 > 2$ grows exponentially.

Proof. $(L_t - 2)' \ge L_t - 2$. □
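To get a feel for the growth rate in Corollary 4.2, the following Python sketch evaluates the lower bound $2 + (L_0 - 2)e^{t}$ and the arc-length time after which that bound reaches a given target. The function names and the sample rank are hypothetical and chosen only for illustration.

```python
import math


def legal_segment_lower_bound(L0: float, t: float) -> float:
    """Lower bound 2 + (L0 - 2) * e^t on the length at time t (needs L0 > 2)."""
    if L0 <= 2:
        raise ValueError("Corollary 4.2 assumes L0 > 2")
    return 2 + (L0 - 2) * math.exp(t)


def time_to_reach(L0: float, target: float) -> float:
    """Time after which the lower bound equals `target` (target >= L0 > 2)."""
    if L0 <= 2 or target < L0:
        raise ValueError("need target >= L0 > 2")
    return math.log((target - 2) / (L0 - 2))


n = 3  # hypothetical rank
t = time_to_reach(3.0, 3 * (6 * n - 6))  # from length 3 to length 3(6n - 6)
print(round(t, 3))
```

Since the bound depends only on $L_0$ and the target length, the time needed for a legal segment of length 3 to reach length $3(6n-6)$ is bounded in terms of the rank alone.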

Definition 4.3. An immersed path or a loop in a metric graph G equipped 
with a train track structure is illegal if it does not contain a legal segment 
of length 3. 

By a surface relation we mean a conjugacy class that with respect to 
some rose crosses every edge twice and has a circle as its Whitehead graph 
(equivalently, attaching a 2-cell to the rose via the curve results in a surface) . 

Lemma 4.4. Let $G_t$, $t \in [\alpha,\omega]$ be a folding path with at least $m$ illegal turns
at every $G_t$, and assume $w|G_\alpha$ is a loop with $m$ illegal turns. If $\ell(w|G_\omega) \le K$
then either

(i) $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded by a function of $K$, or

(ii) $w$ is a surface relation.

Moreover, if $G_t$ has at least $m + 1$ illegal turns for all $t$, then (i) holds, and
in fact the path $G_t$ has bounded length.

Proof. From the derivative formula it follows that $\ell(w|G_\alpha) \le \max\{2, K\}$
(loops of length $> 2$ with no more illegal turns than in $G_t$ grow under
folding), so it follows that $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded from Lemma 3.3 provided
$w$ is simple. We now consider four cases.

Case 1. $\ell(w|G_\alpha) < 2$. Then $w$ is simple as $w|G_\alpha$ crosses some edge at
most once.

Case 2. $\ell(w|G_\alpha) = 2$. Then either $w$ is simple or $w|G_\alpha$ crosses every
edge exactly twice. In the latter case, collapse a maximal tree in $G_\alpha$; with
respect to the resulting rose the Whitehead graph of $w$ is either a circle (and
then $w$ is a surface relation) or the disjoint union of at least two circles (and
then $w$ is simple).

Case 3. $2 < \ell(w|G_\alpha) < 2 + \operatorname{injrad}(G_\alpha)$. Then either $w$ is simple or $w|G_\alpha$
crosses every edge at least twice. Assume the latter. Under our assumption
the edges crossed more than twice form a forest. Collapse a maximal tree
that contains this forest and argue as in Case 2.

Case 4. $\ell(w|G_\alpha) \ge 2 + \operatorname{injrad}(G_\alpha)$. Choose a loop $v|G_\alpha$ with $\ell(v|G_\alpha) =
\operatorname{injrad}(G_\alpha)$. We now claim that $\ell(v|G_t) \le \ell(w|G_t) - 2$ for all $t$. This is
clearly true at $t = \alpha$. Let $t = t_0$ be the last time this is true and assume
$t_0 < \omega$. The derivative formula shows that

$$\frac{d}{dt}\,\ell(v|G_t)\Big|_{t=t_0^+} \le \ell(v|G_{t_0}) \le \ell(w|G_{t_0}) - 2 \le \frac{d}{dt}\,\ell(w|G_t)\Big|_{t=t_0^+}$$

and so the inequality continues to hold for $t > t_0$ (see the paragraph after
Lemma 4.1). Thus $v$ is a simple class with both $\ell(v|G_\alpha)$ and $\ell(v|G_\omega)$
bounded, so $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded.

For the moreover part, we have $\frac{d}{dt}\ell(w|G_t) \ge \ell(w|G_t) - 2\frac{m}{m+1}$ for all $t$.
Thus if $\ell(w|G_\alpha) < 2$ then $w$ is simple and the statement follows, and if
$\ell(w|G_\alpha) \ge 2$, then the function $t \mapsto \ell(w|G_t) - 2\frac{m}{m+1}$ grows exponentially
fast starting with value $\ge \frac{2}{m+1}$ at $t = \alpha$, and bounded length of $w$ at $t = \omega$
implies that the path $G_t$ has bounded length. □
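One way to make the bound on the length of the path explicit is sketched below. It assumes the inequality $\frac{d}{dt}\ell(w|G_t) \ge \ell(w|G_t) - \frac{2m}{m+1}$ (which follows from Lemma 4.1 when $w$ has at most $m$ illegal turns and every $G_t$ has at least $m+1$), the hypothesis $\ell(w|G_\alpha) \ge 2$, and it treats the right derivative as an ordinary derivative on each subinterval; the auxiliary function $f$ is a label introduced here.

```latex
f(t) = \ell(w|G_t) - \frac{2m}{m+1},
\qquad f'(t) \ge f(t),
\qquad f(\alpha) \ge \frac{2}{m+1}
\;\Longrightarrow\;
f(t) \ge \frac{2}{m+1}\,e^{\,t-\alpha}.

% Combining with \ell(w|G_\omega) \le K:
\frac{2}{m+1}\,e^{\,\omega-\alpha} \le K - \frac{2m}{m+1}
\;\Longrightarrow\;
\omega - \alpha \le \log\frac{(m+1)K - 2m}{2}.
```

The right-hand side depends only on $K$ and $m$, which is the sense in which the path has bounded length.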

We also have the following variant. 

Lemma 4.5. Suppose in addition to the hypotheses of Lemma 4.4 that for
some illegal turn in $w|G_\alpha$ one of the two edges $e$ forming the turn is
nonseparating and has length a definite fraction $p > 0$ of the injectivity radius
$\operatorname{injrad}(G_\alpha)$ of $G_\alpha$. Then either:

(i') $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded as a function of $K$ and the fraction $p$, or

(ii') $w$ is a surface relation and any loop $z|G_\alpha$ that contains a segment
$S = e \cdots e$ that closes up to $w|G_\alpha$ fails to be simple.

Proof. In Cases 1 and 4 above conclusion (i') holds, so assume we are in
Cases 2 or 3. In fact we are free to assume $\ell(w|G_\alpha) < 2 + p \operatorname{injrad}(G_\alpha)$,
for otherwise the argument of Case 4 shows that for a suitable embedded
loop $v$ we have $\ell(v|G_t) \le \frac{\ell(w|G_t) - 2}{p}$ for any $t$ and (i') follows (and this time
the bound depends on $p$). Now the forest consisting of the edges crossed by
$w|G_\alpha$ more than twice (assuming all edges are crossed at least twice) does
not include $e$, and we may collapse a maximal tree that contains this forest
but does not contain $e$. Now $z|G_\alpha$ can be thought of as $e \cdots e \cdots = eAeB$
with the subpath $S = eA$ giving $w$. Since $e$ is not collapsed, the Whitehead
graph of $z$ in the rose contains the Whitehead graph of $w$, which is a
1-manifold. So if $w$ is not simple, neither is $z$. □

Let $m$ denote the maximal possible number of illegal turns for any train
track structure on any $G \in \mathcal{X}$. The following lemma generalizes [21 Lemma
2.10]. Before stating it, we need a bit of terminology. Consider a folding
path $G_t$, $t \in [\alpha,\omega]$ and a curve $z|G_t$. The illegal turns along $z$ are folding
as $t$ increases, but at discrete times an illegal turn may become legal, or
several illegal turns may collide and become one (e.g. see Figure 3). We say
that a consecutive collection of illegal turns along $z$ survives to $G_\omega$ if none of
them become legal nor do they collide with a neighboring illegal turn in the
collection, for any $t \in [\alpha, \omega]$. In particular, each illegal turn in the collection,
at any $G_t$, unfolds to a single illegal turn.

Lemma 4.6. Let $z$ be a simple class and $G_t$ a folding path. Assume that
$M = 2m + 1$ consecutive illegal turns of $z|G_\alpha$ survive to $G_\omega$ and that the
legal segments between them in $z|G_\omega$ have bounded size. Then $d_{\mathcal{F}}(G_\alpha, G_\omega)$
is bounded.

Proof. For each $t$ denote by $\mathcal{T}_t$ the set of turns that occur in the given
consecutive collection in $z|G_t$, and let $D_t$ be the set of directions in $G_t$ that
occur as a part of a turn in $\mathcal{T}_t$. Of course, $D_t$ is partitioned into equivalence
classes with respect to the relation "being in the same gate", but we consider
a finer equivalence relation generated by $d \sim d'$ if $\{d, d'\}$ is an illegal turn in
$\mathcal{T}_t$. We will call the equivalence classes subgates. Each subgate, say at the
vertex $v$, gives rise to a "Whitehead graph": the vertices of the graph are
the directions in the subgate, and an edge is drawn between $d$ and $d'$ if the
turn $\{d, d'\}$ occurs in $\mathcal{T}_t$.
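The subgate construction is combinatorial, so it can be sketched directly. The Python code below is an illustration under stated assumptions rather than anything from the paper: directions are given unique hashable labels, a turn is a pair of labels, subgates are computed as the classes of the equivalence relation generated by the turns (via union-find), and a circle in the Whitehead graph of a subgate is detected by the standard fact that a connected simple graph contains a cycle exactly when it has at least as many edges as vertices.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple

Turn = Tuple[Hashable, Hashable]  # a pair of (uniquely labelled) directions


def subgates(turns: List[Turn]) -> List[Set[Hashable]]:
    """Classes of the equivalence relation generated by d ~ d' for each turn."""
    parent: Dict[Hashable, Hashable] = {}

    def find(x: Hashable) -> Hashable:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for d, e in turns:
        root_d, root_e = find(d), find(e)
        parent[root_d] = root_e  # merge the two classes

    classes: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for d in list(parent):
        classes[find(d)].add(d)
    return list(classes.values())


def has_circle(turns: List[Turn], subgate: Set[Hashable]) -> bool:
    """True if the Whitehead graph of the subgate contains a cycle.

    The graph is connected by construction, so it has a cycle
    exactly when #edges >= #vertices.
    """
    edges = {frozenset((d, e)) for d, e in turns
             if d in subgate and e in subgate}
    return len(edges) >= len(subgate)


# Hypothetical collection of illegal turns.
turns = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y")]
for gate in subgates(turns):
    print(sorted(gate), has_circle(turns, gate))
```

In this example the directions a, b, c form one subgate whose Whitehead graph is a triangle, while x, y form a subgate whose Whitehead graph is a single edge.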

Sublemma. Let $d_1, d_2, \cdots, d_k$ be the vertices along a simple closed curve
in the Whitehead graph of a subgate (that is, $d_i \ne d_j$ for $i \ne j$) at $G_t$. Then
for any $s < t$ there is an induced simple closed curve in the Whitehead graph
of a subgate of $G_s$.

Specifically, each turn $\{d_i, d_{i+1}\}$ (taken mod $k$, so including $\{d_k, d_1\}$)
gives rise via the Unfolding Principle to an illegal turn in $G_s$; the Sublemma
says that these turns form a simple closed curve in a subgate (in particular,
all are based at the same vertex).

Proof of Sublemma. Note that it suffices to argue in the case $s = t - \epsilon$ for
small $\epsilon > 0$; for the conclusion of the Sublemma clearly holds in the limit.
To that end, we refer to the discussion just before Lemma 4.1. Each $d_i$
determines either an upward direction at a height $\epsilon$ vertex, or a downward
direction at a height 0 vertex; we call this direction $\tilde d_i$. The segment in $N_\epsilon$
determined by $\tilde d_i$ and $\tilde d_{i+1}$ must have a single illegal turn. We now make
the following observations.

(1) If $\tilde d_i$ is a downward direction, it maps to $d_i$.

(2) If $\tilde d_i$ is an upward direction, then all directions pointing upward at height
0 of that widget are mapped to $d_i$. See Figure 6.

Figure 6: Possible $\tilde d_i$ and $\tilde d_j$, and the associated directions that map to $d_i$
and $d_j$.

(3) It is not possible for a single widget to support both an upward $\tilde d_i$ and
a downward $\tilde d_j$. This is because $\tilde d_j$ would form an illegal turn with the
upward direction within the widget, since the two map to $d_j$ and $d_i$
(which form an illegal turn). However, all illegal turns occur at height
$\epsilon$.

(4) It is not possible for adjacent widgets to both support upward $\tilde d_i$ and $\tilde d_j$.
This is because this would force an illegal turn at the common vertex
formed by directions that map to $d_i$ and $d_j$.

(5) It is not possible for two downward directions $\tilde d_i$ and $\tilde d_j$ to be based at
the same height 0 vertex.

Suppose we have 3 consecutive directions $\tilde d_{i-1}, \tilde d_i, \tilde d_{i+1}$ that determine
two illegal turns in $N_\epsilon$ not based at the same vertex. There are two cases.
If $\tilde d_i$ is downward, say based at the height 0 vertex $w$, paths to $\tilde d_{i\pm 1}$ lead
through two distinct (but adjacent) widgets. Since we are considering a
closed curve in a tree, there must be some $j$ with $j \ne i \ne j+1$ so that the
path from $\tilde d_j$ to $\tilde d_{j+1}$ passes through $w$. This path cannot terminate at $\tilde d_i$,
and by (5) it cannot terminate at $w$ at all, but must continue to another
(adjacent) widget. By (3) the widgets containing $w$ do not support upward
$\tilde d_j$ and $\tilde d_{j+1}$, so the path crosses two illegal turns, contradiction. The other
case is that $\tilde d_i$ is upward, say based at a height $\epsilon$ vertex $w$ inside a widget
$W$. Then there are distinct widgets $W_+$ and $W_-$ adjacent to $W$ so that the
path from $\tilde d_i$ to $\tilde d_{i\pm 1}$ crosses an illegal turn in $W_\pm$. Again there must be
some $j$ so that the path from $\tilde d_j$ to $\tilde d_{j+1}$ either crosses $w$ (if $W_+ \cap W_- = \emptyset$)
or crosses the intersection point $W_+ \cap W_-$ (if there is one, and then this
point is in $W$ as well). In the latter case the path does not terminate at
this point by (3) nor at an upward direction in $W_\pm$ by (4). Thus this path
has two illegal turns, contradiction. In the former case, the path from $\tilde d_j$ to
$\tilde d_{j+1}$ must have at least 3 illegal turns: one at $w$ and one on each side of $w$,
by (3) and (4).

Now we have established that the vertex with the illegal turn crossed by
the path from $\tilde d_i$ to $\tilde d_{i+1}$ is independent of $i$, call it $w$. It remains to observe
that the path from $\tilde d_i$ to $\tilde d_j$ also crosses an illegal turn at $w$ and no others,
even when $|i - j| > 1$. Indeed, if $\tilde d_i$ and $\tilde d_j$ are in the same direction from
$w$, then they have to either be both downward at the same height 0 vertex
of the widget $W$ containing $w$, contradicting (5), or one is downward and
the other upward in the same widget adjacent to $W$, contradicting (3), or
they are upward in widgets adjacent to $W$ and to each other, contradicting
(4). □

We now continue the proof of Lemma 4.6. We first consider the equivalence
relation on the set of turns $\mathcal{T}_t$ in our consecutive collection generated
by $\tau_1 \sim \tau_2$ if there is a simple closed curve in the Whitehead graph of a
subgate containing $\tau_1$ and $\tau_2$. In other words, the equivalence classes are
obtained from Whitehead graphs of subgates by cutting open along cut
vertices; equivalent turns are represented by edges in the same component.
According to the Sublemma, unfolding equivalent turns produces equivalent
turns, and in particular the vertices where the turns are based coincide.
Next, note that the number of equivalence classes is always less than or
equal to the number of illegal turns (with equality only if subgates coincide
with gates and their Whitehead graphs are trees). Under unfolding, distinct
illegal turns might unfold to the same illegal turn, so the number might
decrease. By subdividing $[\alpha, \omega]$ into a bounded number of intervals and
renaming we may assume that the number of illegal turns in $\mathcal{T}_t$ is constant.
Likewise, we may assume that the number of equivalence classes is constant
(in general it may decrease under unfolding when a new circle is formed).
We now consider two cases.

Case 1. Some subgate contains a circle in its Whitehead graph at time
$\omega$. Then, by the Sublemma, the same is true at every $G_t$. Choose two
equivalent illegal turns at time $\omega$ in $\mathcal{T}_\omega$ so that the number of illegal turns
between them is smaller than the number of equivalence classes. Thus the
curve obtained by closing up this segment has strictly fewer illegal turns
than $m(G_\omega)$, and this continues to hold after unfolding in every $G_t$. The
conjugacy class $w$ represented by this curve has bounded length at all $G_t$
with number of illegal turns strictly smaller than $m(G_t)$. By the last sentence
of Lemma 4.4, $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded.

Case 2. There are no circles in Whitehead graphs of subgates at time
$\omega$. Then the number of illegal turns in $\mathcal{T}_\omega$ does not exceed $m(G_\omega)$. If an
embedded circle appears in the Whitehead graph of some subgate at some
time $t$ we use the Case 1 discussion on $[\alpha, t]$, so we will assume that there are no
circles for any $t$. Thus the number $s$ of illegal turns in our collection never
exceeds $m(G_t)$.

First assume that there are two occurrences of the same illegal turn in
the consecutive collection at time $\omega$ that are separated by $< s - 1$ illegal
turns. Closing up gives a curve with $< s$ illegal turns, so again the conclusion
follows from the last sentence of Lemma 4.4.

So from now on we assume that this does not happen, i.e. all $s$ illegal
turns occur repeatedly in a cyclic order in the consecutive collection. If it so
happens that one of these illegal turns at $t = \alpha$ involves an edge $e$ which is
nonseparating and has length a definite fraction $p$ of $\operatorname{injrad}(G_\alpha)$, we argue
using Lemma 4.5. Let $w|G_\alpha$ be the loop obtained by closing up the segment
that starts with $e$ and ends at the next occurrence of the same illegal turn. If
the edge following this segment is $e$, we can appeal to Lemma 4.5 to deduce
that $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded (because $z$ is simple). If the edge following
the segment is not $e$, then the last edge of the segment is $e$ and closing up
forces cancelation. If the tightened loop has length $< 2$, it is simple and its
image in $G_\omega$ is bounded, so $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded. If the tightened loop
has length $\ge 2$, then the original segment has length $\ge 2 + 2p \operatorname{injrad}(G_\alpha)$
and the same argument as in Case 4 of Lemma 4.4 (see also the proof of Lemma
4.5) shows that for $\ell(v|G_\alpha) = \operatorname{injrad}(G_\alpha)$ we must have $\ell(v|G_\omega)$ bounded,
so $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is again bounded.

So now all that remains is reducing to the case where this technical 
condition about the edge e is satisfied. 

As a warmup, first consider the case where $m(G_t) = s$ for all $t$, i.e. all
illegal turns in $G_t$ appear in our collection. Let $\beta \in [\alpha,\omega]$ be the first time
that a nonseparating edge of length > ■^j^_^ injrad{Ga) is involved in an 
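illegal turn. If there is no such $\beta$ then clearly $d_{\mathcal{F}}(G_\alpha, G_\omega) \le 8$ (their coarse
projections to $\mathcal{F}$ intersect). Otherwise $d_{\mathcal{F}}(G_\alpha, G_\beta) \le 8$ (for the same reason) and
$d_{\mathcal{F}}(G_\beta, G_\omega)$ is bounded since at $G_\beta$ our technical condition holds.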

Also recall that the case $m(G_t) > s$ for all $t$ can be dealt with using
Lemma 4.4 as before. It remains to consider the hybrid case, when $m(G_t)$
sometimes equals $s$ and sometimes is $> s$. There are two arguments for
this, as follows.

First argument. If $\ell(w|G_\alpha) < 2$ then $w$ is simple and we are done.
Assuming $\ell(w|G_\alpha) \ge 2$ we see that on subintervals of $[\alpha, \omega]$ where $m(G_t) > s$
the derivative of $\ell(w|G_t)$ is bounded below by a number $> 1$. Since $\ell(w|G_\omega)$
is bounded, the total length of the path on such intervals is also bounded.
The argument in case $m(G_t) = s$ fails if the illegal turn involving the long
edge $e$ does not appear in our collection. So in this case necessarily $m(G_t) >
s$ on a subinterval on which the path $G_t$ has definite length. Therefore this
can happen only a bounded number of times, and the path $G_t$ has bounded
projection between these occurrences by the arguments above.

Second argument. It is convenient to revert to the language of
morphisms, i.e. we do not rescale graphs obtained by folding. Let $G'_{t'}$, $t' \in [\alpha, \omega']$
be the folding path obtained from the morphism $G_\alpha \to G_\omega$ by folding only
the $s$ illegal turns that occur in our collection (so $G'_\alpha = G_\alpha$). There is an
induced morphism $\phi : G'_{\omega'} \to G_\omega$. Since $d_{\mathcal{F}}(G_\alpha, G'_{\omega'})$ is bounded by the
special case $m(G'_{t'}) = s$, it remains to argue that $d_{\mathcal{F}}(G'_{\omega'}, G_\omega)$ is bounded.
Note that $\phi$ immerses the segment spanned by our collection of illegal turns.
If the length of the subsegment bounded by two consecutive occurrences of
the same turn is $< 2 \operatorname{vol}(G'_{\omega'})$, then the loop obtained by closing up is simple
and has bounded length in all $G_t$ and $G'_{t'}$, so we are done. If the length of
this segment is $\ge 2 \operatorname{vol}(G'_{\omega'})$ then the distance in $\mathcal{X}$ from (rescaled) $G'_{\omega'}$ to
$G_\omega$ is bounded since the image segment in $G_\omega$ has bounded length. □

Lemma 4.7. Let $z$ be a simple class and $G_t$ a folding path. Assume $z|G_t$ is
illegal for all $t$. Then either $\ell(z|G_\omega) < \ell(z|G_\alpha)/2$ or $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded.

Proof. There are two cases. First suppose that the average distance between
consecutive illegal turns in $z|G_\alpha$ is $\ge 1/m$. Then by Lemma 4.6 after a
bounded progress in $\mathcal{F}$ the loop $z$ must lose at least $1/M$ of its illegal turns.
Repeating this a bounded number of times, we see that after a bounded
progress in $\mathcal{F}$ the loop $z$ will have less than $1/(6m)$ of the illegal turns it had
at $G_\alpha$. Thus either the length of $z$ at that point is less than $1/2$ of its initial
length or the average distance between illegal turns is $> 3$, so there is a legal
segment.
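The arithmetic behind this dichotomy can be spelled out as follows. The symbols $k_\alpha$ (the number of illegal turns of $z|G_\alpha$) and $k$, $\ell$ (the number of illegal turns and the length of $z$ at the later point, with $k > 0$) are labels introduced here for the sketch and do not appear in the original.

```latex
\ell(z|G_\alpha) \ge \frac{k_\alpha}{m},
\qquad
k < \frac{k_\alpha}{6m},
\qquad
\ell \ge \tfrac{1}{2}\,\ell(z|G_\alpha)
\;\Longrightarrow\;
\frac{\ell}{k} > \frac{k_\alpha/(2m)}{k_\alpha/(6m)} = 3 .
```

An average distance greater than 3 between consecutive illegal turns forces some legal segment of length greater than 3, which is incompatible with $z|G_t$ being illegal in the sense of Definition 4.3.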

Now suppose the average distance between illegal turns in $z|G_\alpha$ is $< 1/m$.
By Lemma 4.1 the right derivative of the length of $z$ is

$$\frac{d}{dt}\,\ell(z|G_t)\Big|_{t=t_0^+} = \ell(z|G_{t_0}) - 2\,\frac{k_0}{m_0}$$

where $k_0$ is the number of illegal turns in $z|G_{t_0}$. We are assuming that
$k_0/m_0 > \ell(z|G_{t_0})$ so the above derivative is $< -\ell(z|G_{t_0})$. Thus in this case
the length of $z$ decreases exponentially, until either half the length is lost
after a bounded distance in $\mathcal{X}$, or the average distance between illegal turns
becomes $\ge 1/m$, when the above argument finishes the proof. □

Remark 4.8. One source of asymmetry between legality and illegality is that 
a long legal segment gets predictably longer under folding, while a long illegal 
segment may not get longer under unfolding. For example, take a surface 
relation inside a subgraph where folding amounts to an axis of a surface 
automorphism. But the lemma above implies that an illegal segment inside 
a simple loop will get predictably longer under unfolding after a definite 
progress in $\mathcal{F}$.

5 Projection to a folding path 

We thank Michael Handel for pointing out the technique for proving the
following lemma (see [12, Proposition 8.1]).

Lemma 5.1. Let $G \in \mathcal{X}$ be a metric graph with a train track structure and
$A < \mathbb{F}$ a free factor. Suppose $A|G$ satisfies:

• there is an illegal loop $a$ in $A|G$,

• there is an immersed legal segment in $A|G$ of length $3(6n - 6)$, $n =
\operatorname{rank}(\mathbb{F})$.

Then $d_{\mathcal{F}}(A, G)$ is bounded.

Proof. If the injectivity radius of $A|G$ is $< 3(6n - 6)$ the conclusion follows
from Corollary 3.5.

Choose a complementary free factor $B$ to $A$ and add a wedge of circles
to $A|G$ representing $B$ to get a graph $H$. Extend $A|G \to G$ to a homotopy
equivalence (difference of markings) $H \to G$, which is an immersion on each
1-cell. Pull back the metric to $H$ and consider a folding path from $H$ to $G$.
Let $H'$ be the first graph on this folding path with injectivity radius $3(6n - 6)$.
Since $A|G \to H'$ is an immersion, the interior of the legal segment in $A|G$
embeds in $H'$. Now $H'$ has at most $3n - 3$ edges and $6n - 6$ half-edges, so
there is a legal segment of length 3 inside one of the edges. Thus the image
of $a$ does not contain this edge and hence $d_{\mathcal{F}}(a, H') \le 5$. Since $d_{\mathcal{F}}(a, A) \le 1$
and $d_{\mathcal{F}}(H', G)$ is bounded by Corollary 3.5 the statement follows. □

The following is a slight generalization. 

Lemma 5.2. Let $G \in \mathcal{X}$ be a metric graph with a train track structure and
$A < \mathbb{F}$ a free factor. Suppose $A|G$ satisfies:

• there is a loop $a$ in $A|G$ with the maximal number of pairwise disjoint
legal segments of length 3 bounded by $N$,

• there is an immersed legal segment in $A|G$ of length $3(6n - 6)$, $n =
\operatorname{rank}(\mathbb{F})$.

Then $d_{\mathcal{F}}(A, G)$ is bounded as a function of $N$.

Proof. The proof is similar. The loop $a|H'$ crosses an edge at most $N$ times,
so Lemma 3.3 applies. □

Now let $G_t$, $t \in [\alpha,\omega]$ be a folding path. We usually parametrize it by
arc-length. Let $A$ be the conjugacy class of a free factor.

A folding path has a natural orientation given by the parametrization. 
We will think of this orientation as going left to right. 

By $I$ denote the number

$$I = (6m + 6)(6n - 6)$$

where $m$ is the maximal possible number of illegal turns in any $G \in \mathcal{X}$ (so
$m$ is some linear function of the rank). Here $6n - 6$ is the maximal number
of half-edges in a graph in $\mathcal{X}$, while $6m + 6$ is a number that comes out of
Lemma 5.8 below when applied to edge length 4. This choice of $I$ is used in
the key Proposition 5.9.
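As a small consistency check on this constant, the sketch below evaluates the bound of Lemma 5.8, read here as $\frac{3}{2} m \cdot \text{edgelength} + 6$, at edge length 4 and confirms that it gives $6m + 6$. The function names and the sample values of $m$ and $n$ are hypothetical; the text only says that $m$ is some linear function of the rank.

```python
def lemma_5_8_bound(m: int, edge_length: float) -> float:
    """The bound (3/2) * m * edge_length + 6 (our reading of Lemma 5.8)."""
    return 1.5 * m * edge_length + 6


def constant_I(m: int, n: int) -> int:
    """I = (6m + 6)(6n - 6)."""
    return (6 * m + 6) * (6 * n - 6)


m, n = 4, 3  # hypothetical sample values
assert lemma_5_8_bound(m, 4) == 6 * m + 6  # edge length 4 gives 6m + 6
print(constant_I(m, n))  # (24 + 6) * (18 - 6) = 360
```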

Definition 5.3. The left projection $\lambda_{G_t}(A)$ of $A$ to $G_t$ is:

$$\lambda_{G_t}(A) = \inf\{t \mid A|G_t \text{ has an immersed legal segment of length } 3\}$$

The right projection $\rho_{G_t}(A)$ of $A$ to $G_t$ is:

$$\rho_{G_t}(A) = \sup\{t \mid A|G_t \text{ has an immersed illegal segment of length } I\}$$

Recall here that a segment is illegal if it does not contain a legal subsegment
of length 3.

For simplicity, we will also denote by $\lambda_{G_t}(A)$, $\rho_{G_t}(A)$ the graph $G_t$ with
$t = \lambda_{G_t}(A), \rho_{G_t}(A)$.

We make analogous definitions for any simple class $a$, so e.g. $\lambda_{G_t}(a)$
is the first time $a|G_t$ contains a legal segment of length 3 (here wrapping
around is allowed, i.e. $a|G_t$ may be legal and of length $< 3$).

If the above sets are empty, we interpret inf as $\omega$ and sup as $\alpha$. Note
that the first set is closed under the operation of increasing $t$. Clearly,

$$\lambda_{G_t}(A) \le \rho_{G_t}(A).$$
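The conventions for empty sets can be illustrated on a finite sample of times. The following Python sketch is only a toy model: the two predicates stand for hypothetical oracles answering whether $A|G_t$ contains a legal segment of length 3 or an illegal segment of length $I$, and deciding them would require an actual graph computation that is not shown. Taking min and max over sampled times only approximates the inf and sup.

```python
from typing import Callable, Sequence


def left_projection(times: Sequence[float],
                    has_legal_3: Callable[[float], bool],
                    omega: float) -> float:
    """Smallest sampled time where the predicate holds; omega if none does."""
    hits = [t for t in times if has_legal_3(t)]
    return min(hits) if hits else omega


def right_projection(times: Sequence[float],
                     has_illegal_I: Callable[[float], bool],
                     alpha: float) -> float:
    """Largest sampled time where the predicate holds; alpha if none does."""
    hits = [t for t in times if has_illegal_I(t)]
    return max(hits) if hits else alpha


# Hypothetical sample path on [0, 4] with made-up predicates.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
print(left_projection(times, lambda t: t >= 2, omega=4.0))   # 2.0
print(right_projection(times, lambda t: t <= 3, alpha=0.0))  # 3.0
```

The first predicate in the example is monotone in $t$, mirroring the remark that the first set is closed under increasing $t$.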

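Proposition 5.4. Suppose $B < A$ are free factors. Then

• $\lambda_{G_t}(A) \le \lambda_{G_t}(B)$ and $\rho_{G_t}(A) \ge \rho_{G_t}(B)$,

• either $d_{\mathcal{X}}(\lambda_{G_t}(A), \lambda_{G_t}(B))$ is bounded or the distance in $\mathcal{F}$ between $A$
and any graph along $[\lambda_{G_t}(A), \lambda_{G_t}(B)]$ is bounded.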

Proof. The first bullet is clear since we are taking inf and sup over smaller
sets. Let $\lambda(A) = \lambda_{G_t}(A)$, $\lambda(B) = \lambda_{G_t}(B)$, and denote by $\lambda'(A)$ the first
time along the folding path that $A|G_t$ has a legal segment of length $3(6n -
6)$. Then $\lambda(A) \le \lambda'(A)$, and the distance $d_{\mathcal{X}}(\lambda(A), \lambda'(A))$ is bounded by
Corollary 4.2. If $\lambda(B) \le \lambda'(A)$ we are done, so suppose $\lambda'(A) < \lambda(B)$.
It follows from Lemma 5.1 that the set of $G_t$'s for $t \in [\lambda'(A), \lambda(B)]$ has a
bounded projection in $\mathcal{F}$, and the projection is close to $A$. □

Given a constant $K > 0$, we will say that a coarse path $\gamma : [\alpha,\omega] \to \mathcal{F}$
is a reparametrized quasi-geodesic if there is a subdivision $\alpha = t_0 < t_1 <
\cdots < t_m = \omega$ such that $\operatorname{diam}_{\mathcal{F}}(\gamma([t_i, t_{i+1}])) \le K$, $m \le d_{\mathcal{F}}(\gamma(\alpha), \gamma(\omega))$,
and $|i - j| \le d_{\mathcal{F}}(\gamma(t_i), \gamma(t_j)) + 2$ for all $i, j$. A coarse Lipschitz function
$f : X \to Y$ between metric spaces is one that satisfies $d_Y(f(x_1), f(x_2)) \le
K d_X(x_1, x_2) + K$ for all $x_1, x_2 \in X$. A function $f : X \to A \subset X$ is a coarse
retraction if $d(a, f(a)) \le K$ for all $a \in A$. In all these cases, $f$ is allowed to
be multivalued with the bound of $K$ on the diameter of a point image.
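The two inequalities in these definitions are easy to test on a finite sample of points. The Python sketch below does this for single-valued maps only (the multivalued case is not modelled); the function names and the toy example of the floor map on the real line are assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Callable, Sequence


def is_coarse_lipschitz(points: Sequence[float],
                        f: Callable[[float], float],
                        d_X: Callable[[float, float], float],
                        d_Y: Callable[[float, float], float],
                        K: float) -> bool:
    """Check d_Y(f(x1), f(x2)) <= K * d_X(x1, x2) + K on all sampled pairs."""
    return all(d_Y(f(x1), f(x2)) <= K * d_X(x1, x2) + K
               for x1, x2 in combinations(points, 2))


def is_coarse_retraction(subset: Sequence[float],
                         f: Callable[[float], float],
                         d: Callable[[float, float], float],
                         K: float) -> bool:
    """Check d(a, f(a)) <= K for every sampled point a of the subset."""
    return all(d(a, f(a)) <= K for a in subset)


def dist(x: float, y: float) -> float:
    return abs(x - y)


# Toy example: the floor map from the real line onto the integers.
sample = [0.0, 0.4, 0.9, 1.1, 2.5, 7.3]
print(is_coarse_lipschitz(sample, math.floor, dist, dist, K=1))  # True
print(is_coarse_retraction([0, 1, 2, 7], math.floor, dist, K=1))  # True
```

A check on finitely many points can only refute the property, not prove it; here it serves to make the quantifiers in the definitions concrete.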

Corollary 5.5. For any folding path $G_t$ the projection

$$\mathcal{F} \to \pi(G_t)$$

is a coarse Lipschitz retraction with constants independent of $A$ and $G_t$.
Consequently, paths $\pi(G_t)$ are reparametrized quasi-geodesics in $\mathcal{F}$.

Proof. That the map is coarsely Lipschitz follows from Proposition 5.4. To
prove that it is a coarse retraction, we need to argue that $\pi(\lambda_{G_t}(\pi(G_0)))$
is bounded distance from $\pi(G_0)$. Let $a$ be the conjugacy class of a legal
candidate in $G_0$, so $\lambda(a) \le 0$ and $d_{\mathcal{F}}(a, G_0)$ is bounded. We will argue
that $d_{\mathcal{F}}(\lambda(a), G_0)$ is bounded. Let $t_0$ be the smallest parameter such that
$a|G_{t_0}$ is legal. Then $\ell(a|G_{t_0}) \le 2$ so $d_{\mathcal{F}}(G_{t_0}, G_0)$ is bounded. Now note that
$\lambda(a) = t_0$ since for $t < t_0$ any legal segment of length 3 in $a|G_t$ would force
$\ell(a|G_{t_0}) \ge 3$.

The argument for the second part is from [?]. Let $G_t$ be a folding path
so that $\pi(G_t)$ is a coarse path joining free factors $A$ and $B$. Choose a
geodesic $C_i$, $i = 0, \cdots, m$ of free factors joining $C_0 = A$ and $C_m = B$ in $\mathcal{F}$.
Consider the coarse projection $D_i$ of $C_i$ to $\pi(G_t)$. By Proposition 5.4 the
diameter of the segment bounded by $D_i$ and $D_{i+1}$ is uniformly bounded.
Now the $D_i$'s may not occur monotonically along $\pi(G_t)$. To fix this, let
$i_1 < i_2 < \cdots < i_k$ be the sequence defined inductively by $i_1 = 0$ and $i_{j+1}$ is
the smallest index such that $D_{i_{j+1}}$ occurs after $D_{i_j}$ in the order on $G_t$ given
by $t$. Then by construction the interval between $D_{i_j}$ and $D_{i_{j+1}}$ has uniformly
bounded diameter and the number $k$ is bounded by $m = d_{\mathcal{F}}(A, B)$. Call
a subdivision satisfying these properties admissible. To ensure the last
property $|i - j| \le d_{\mathcal{F}}(\gamma(t_i), \gamma(t_j)) + 2$ take an admissible subdivision with
minimal $k$. □
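The inductive choice of the indices $i_1 < i_2 < \cdots < i_k$ amounts to a single left-to-right scan. In the Python sketch below each coarse projection $D_i$ is modelled, as a simplifying assumption, by a single parameter value along the path, and the next index is taken to be the smallest later index whose position lies strictly beyond the current one; the function name is hypothetical.

```python
from typing import List, Sequence


def monotone_indices(positions: Sequence[float]) -> List[int]:
    """Start at index 0; repeatedly take the smallest later index whose
    position is strictly beyond the position of the last chosen index."""
    if not positions:
        return []
    chosen = [0]
    for i in range(1, len(positions)):
        if positions[i] > positions[chosen[-1]]:
            chosen.append(i)
    return chosen


# Hypothetical positions of D_0, ..., D_6 along the path.
positions = [0.0, 2.0, 1.0, 1.5, 3.0, 2.5, 4.0]
print(monotone_indices(positions))  # [0, 1, 4, 6]
```

Every skipped index lies between two chosen ones and its position does not exceed that of the previously chosen index, which is why the gaps between consecutive chosen projections stay controlled by the bounds on consecutive $D_i$'s.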

Definition 5.6. For $k > 0$, $C > 0$ we say a folding path $G_t$ makes
$(k, C)$-definite progress in $\mathcal{F}$ if for any $D > 0$ and $s < t$, $d_{\mathcal{X}}(G_s, G_t) > Dk + C$
implies $d_{\mathcal{F}}(G_s, G_t) > D$.

Corollary 5.7. For any folding path $G_t$ the projection

$$\mathcal{F} - R(G_t) \to \{G_t\}$$

where $R(G_t)$ is the set of free factors at a certain bounded distance from
$G_t$ measured in $\mathcal{F}$, is coarsely Lipschitz (with respect to the path metric in
$\mathcal{F} - R(G_t)$).

Moreover, the projection is coarsely defined and coarsely Lipschitz on
all of $\mathcal{F}$ provided $G_t$ makes $(k, C)$-definite progress in $\mathcal{F}$ (with constants
depending on $k, C$). □

Lemma 5.8. Let $G_t$ be a folding path, $t \in [\alpha,\omega]$, and let $A$ be a free factor.
The length of any illegal path contained in a topological edge of $A|G_\omega$ is less
than

$$\frac{3}{2}\, m \cdot \operatorname{edgelength}(A|G_\alpha) + 6$$

where $m$ is the maximal number of illegal turns in any $G_t$ and $\operatorname{edgelength}(A|G_\alpha)$
is the maximal length of an edge in $A|G_\alpha$.

Proof. Fix an illegal path of length $L$ in the interior of a topological edge
of $A|G_\omega$. We will assume that the endpoints are illegal turns and argue
$L \le \frac{3}{2} m \cdot \operatorname{edgelength}(A|G_\alpha)$; after adding $< 3$ on each end we recover any
illegal path. By the Unfolding Principle this path lifts to an illegal path
bounded by illegal turns inside some topological edge of $A|G_\alpha$, so has length
$L_0 \le \operatorname{edgelength}(A|G_\alpha)$. Let $t_0$ be the first time the right derivative of the
length of the path is non-negative (if such $t_0$ does not exist then $L \le L_0$).
Thus the length at $t_0$ is $\le L_0$ and the average length of a maximal legal
segment inside the path is $\ge 2/m(G_{t_0}) \ge 2/m$. If the length of the path grows
by a factor $> 3m/2$ the average length of a legal segment is guaranteed to
be $> 3$. □

Proposition 5.9. Let $G_t$ be a folding path and $A$ a free factor. Assume that
$A|G_\alpha$ has a legal segment of length 3, and that $A|G_\omega$ has an illegal segment
of length $I$. Then $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is bounded.

Proof. First note that by replacing $G_\alpha$ with $G_{\alpha+t}$ for a bounded $t$, we may
assume that $A|G_\alpha$ has a legal segment of length $5(6n - 6)$. Assuming the
distance $d_{\mathcal{F}}(G_\alpha, G_\omega)$ is large, let $r \in [\alpha,\omega]$ be chosen so that $d_{\mathcal{F}}(G_\alpha, G_r)$,
$d_{\mathcal{F}}(G_r, G_\omega)$ and $d_{\mathcal{F}}(G_r, A)$ are all large. Wedge on a rose representing a
complementary free factor to $A|G_r$ to get a graph $H'$ and a map (difference
of markings) $H' \to G_r$ which is an isometric immersion on every edge. If
$H'$ has bounded injectivity radius then $d_{\mathcal{F}}(A, G_r)$ is bounded, contradicting
the choice of $G_r$. So suppose the injectivity radius of $H'$ is large. Now fold
$H'$ towards $G_r$ until a graph $H$ is reached which is the last time there is an
edge $E$ of length 4. By folding further we may assume that the complement
of $E$ immerses to $G_r$.

We observe that we may assume that the complement of $E$ does not have a
valence 1 vertex. Indeed, assuming otherwise, with respect to the map
$H \to G_\tau$ there are two possibilities for the illegal turns (see
Figure 7). In the left picture, the length of $E$ stays 4 under folding. We
continue folding until the separating edge folds in with $E$, and this is our
new $H$. The right picture is impossible: $E$ is a "monogon" (of length 4) and
the folding towards $G_\tau$ stops before the loop degenerates. But this means
that $G_\tau$ has volume $> 2$.





Figure 7: Two possibilities when E is a loop attached to a separating edge. 
The square represents the remainder of the graph. 



We will also assume for concreteness that the complement of $E$ is connected,
and denote by $B$ the free factor determined by it. When the complement is
disconnected, there are two free factors determined by the components. The
changes are straightforward and left to the reader.

A turn in $H$ is either degenerate if the two directions map to the same
direction by $H \to G_\tau$, or non-degenerate, and in the latter case it is
legal or illegal, depending on its image in $G_\tau$. The same terminology
will be applied to turns in $H_t$ with respect to $\psi_t : H_t \to G_t$
constructed below. Note that $H$ may have illegal turns based at points in the
interior of a topological edge. The degenerate turns must involve a direction
at an endpoint of $E$, pointing into $E$.

There are two cases. 

Case 1. $E$ contains a legal segment of length 3, so the illegal segments
in $A|G_\tau$ do not cross $E$. As $G_\tau$ folds towards $G_\omega$ we will
fold $H$ producing a path (though usually not a folding path) $H_t$. To
describe $H_t$ it is convenient to view the folding path $G_t$ as in
Proposition 2.2, i.e. without rescaling and folding with speed 1 (and we will
not rename the parametrizing interval). Define $H_t$ as the graph obtained
from $H$ by folding all illegal turns with speed 1 with respect to the
morphism $H \to G_\omega$ (the composition $H \to G_\tau \to G_\omega$) for
time $t - \tau$, the same time it takes $G_\tau$ to fold to $G_t$. By
Lemma 2.3 this is equivalent to folding with respect to $H \to G_t$.
Alternatively, one can describe $H_t$ directly in terms of local moves.
Suppose $H_{t_0}$ is defined for some $\tau \le t_0 < \omega$ and the
following conditions hold for $t = t_0$:

• $H_\tau = H$,

• $H_t = B|G_t \cup E_t$, where $E_t$ is a single edge,

• the immersion $B|G_t \to G_t$ extends to a morphism (difference of
markings) $\psi_t : H_t \to G_t$ which is an isometric immersion on $E_t$,

• every nontrivial conjugacy class in $A$ is represented by a loop in $H_t$
whose $\psi_t$-image is immersed (equivalently, loops in $A|H_t$ do not cross
degenerate turns).

For small $\epsilon > 0$ define $H_t$, $t \in [t_0, t_0+\epsilon]$, by
identifying pairs of segments of length $t$ representing degenerate turns in
$H_{t_0}$ and illegal turns. That this local description produces a global
path $H_t$, $t \in [\tau,\omega]$, can be argued as in 2.2 (and indeed $H_t$
is a subpath of the folding path induced by $H \to G_\omega$ as indicated
above). It is also clear from the local description that if the above
properties hold for $t = t_0$ then they hold for $t \in [t_0, t_0+\epsilon]$
for small $\epsilon > 0$. These properties also hold in the limit (for the
second bullet one needs to argue that $E_t$ does not collapse to a point in
the limit, but that follows from Corollary 4.2).

Now note that the length of any illegal segment contained in an edge of
$H_t$ (with respect to the train track structure obtained by pulling back via
$\psi_t$) is uniformly bounded by $(6m+6)\,\mathrm{vol}(G_t)$, by Lemma 5.8.

Now rescale $G_t$ and $H_t$ by dividing the metric by $\mathrm{vol}(G_t)$. So
either the injectivity radius of $H_t$ is bounded at time $t$ (in which case
the projection of $G_t$ is bounded distance away from $B$, and hence the
projection of $G_\tau$), or there are no illegal paths in $H_t$ of length
$> I = (6m+6)(6n-6)$ and hence the same holds for $A|G_t$. Applying this to
$t = \omega$ we see that $d_{\mathcal F}(G_\tau,G_\omega)$ is bounded,
contradicting the choice of $G_\tau$.

We remark that we can prevent $E_t$ from becoming a loop attached along
a separating edge by agreeing not to fold the two directions at the endpoints
of $E_t$ if they form a degenerate turn and there are no other degenerate
turns.

Case 2. $E$ doesn't contain a legal segment of length 3, so the legal
segments in $A|G_\tau$ do not cross $E$. We will produce a path $H_t$,
$t \in [\alpha,\tau]$, satisfying:

• $H_\tau = H$,

• $H_t = B|G_t \cup E_t$, where $E_t$ is a single edge,

• the immersion $B|G_t \to G_t$ extends to a morphism (difference of
markings) $\psi_t : H_t \to G_t$ which is an isometric immersion on $E_t$,

• every conjugacy class in $A$ is represented by a loop in $H_t$ whose
$\psi_t$-image is immersed.

We define $H_t$ by going backwards from $t = \tau$ to $t = \alpha$. Assuming
$H_{t_0}$ is defined satisfying the conditions above with $t_0 > \alpha$, we
define $H_t$ for $t \in [t_0-\epsilon, t_0]$ via the following local
operations.

We will define $H_t$ as $B|G_t$ with an edge $E_t$ attached, and we specify
the attaching points. Consider first the case that a direction $e$ emanating
from an endpoint $v$ of $E_{t_0}$ forms a degenerate turn (with respect to
$\psi_{t_0}$) with a direction $b$ in $B|G_{t_0}$ (such a direction is then
unique). Intuitively, as $t$ decreases, $B|G_t$ unfolds and we choose to fold
$b$ and $e$ with speed 1. A more elaborate description follows.

Let $\phi = \phi_{t t_0} : G_t \to G_{t_0}$ be the folding map. This induces a
folding map $\phi_B : B|G_t \to B|G_{t_0}$. The set
$\phi_B^{-1}(v) = \{v_1, \dots, v_k\}$ may consist of more than one point, all
of which fold together in $B|G_{t_0}$. At each $v_i$ there is a unique
direction $b_i$ that maps to $b$, and moving in those directions distance
$t_0 - t$ produces a unique point $v_t$, and this is one of the two attaching
points for $E_t$. See Figure 8.





Figure 8: To make the attaching point for $E_t$ not depend on choices, we
have to fold with the $b$-direction while the other illegal turns get unfolded.

In the language of widgets, there are two possibilities. If $b$ induces an
upward direction $b$ from some vertex $w$ at height $t_0 - t$, then the $b_i$
are the upward directions based at height 0 vertices of the widget containing
$w$, and we have $v_t = w$. If $b$ induces a downward direction $b$ based at a
height 0 vertex, we let $v_t$ be the vertex in the direction of $b$ at height
$-(t_0 - t)$ (in this case we did this just for consistency, height 0 would
have worked just as well).

Now suppose that the direction $e$ does not degenerate with any direction
in $B|G_{t_0}$. Then there is a natural way to construct the attaching point
in $G_t$ by watching $G_{t_0}$ unfold to $G_t$. More precisely, the union (a
"star") of $2\epsilon$-segments emanating from $v$ in the directions $e$ and
$B|G_{t_0}$ embeds in $G_{t_0}$. The preimage of this star in $G_t$ is a tree,
and there is a natural way of choosing the attaching point $v_t \in B|G_t$ so
that this tree lifts to $B|G_t \cup E_t$. See Figure 9.









Figure 9: To decide where to attach $E_t$, mimic what happens in $G_t$.

There is a unique homotopy class of paths in $G_t$ connecting the images of
attaching points whose image in $G_{t_0}$ is obtained by pre- and
post-composing $\psi_{t_0}(E_{t_0})$ with paths of length $< \epsilon$. Define
$\psi_t$ on $E_t$ to be the immersed path in this class and choose the metric
on $E_t$ so that this map is isometric.

Now suppose $H_t$ is defined for $t \in (t_0, \tau]$ and we want to define
$H_{t_0}$. We will argue that length functions in $H_t$ converge as
$t \to t_0$. The derivative of a length function for the conjugacy class
$\gamma$ with respect to negative time is the difference between the number of
illegal turns (these are unfolding) and the number of degenerate turns (these
are folding) $\gamma|H_t$ crosses. Since the number of degenerate turns can
only drop and the number of illegal turns can only grow, we need only argue
that the length of $\gamma$ does not go to infinity as $t \to t_0$.

If $\gamma$ represents a conjugacy class in $A$, its length in $H_t$ is equal
to its length in $G_t$, so the limit is $\ell(\gamma|G_{t_0})$. In particular,
the length of $E_t$ stays bounded assuming some conjugacy class in $A$ crosses
it (if not, $\pi(G_\tau)$ is within bounded distance of $A$, contradicting the
choice of $G_\tau$). It follows that the volume of $H_t$ stays bounded, since
the volume of $B|G_t$ converges to the volume of $B|G_{t_0}$. Now
$\gamma|G_t$ is obtained from $\gamma|H_t$ by folding the degenerate turns.
The number of such turns is bounded by twice the number of times
$\gamma|H_t$ crosses $E_t$ (which is constant in $t$), and the amount of
folding at each degenerate turn is bounded by the volume of $H_t$. Thus the
difference $\ell(\gamma|H_t) - \ell(\gamma|G_t)$ is bounded, so
$\ell(\gamma|H_t)$ stays bounded.

This shows that there is a limiting graph $H_{t_0}$. We also need to argue
that $E_t$ does not degenerate to a point at $t = t_0$, and this follows from
the Unfolding Principle. The map $H_{t_0} \to G_{t_0}$ is obtained by taking
limits. That the above properties hold in the limit is immediate.

If the injectivity radius of the (rescaled) $H_t$ is $> 5(6n-6)$ then there
can be no immersed legal segments of length $5(6n-6)$ (they would give a legal
segment of length 5 inside some edge, and this would imply that an edge of
$H_\tau = H$ has length $> 5$). Thus the injectivity radius of $H_t$ is
bounded for $t \in [\alpha,\tau]$ and hence $d_{\mathcal F}(G_\alpha,G_\tau)$
is bounded, again a contradiction. □

Corollary 5.10. The image of the interval
$[\lambda_{G_t}(A), \rho_{G_t}(A)]$ in $\mathcal F$ has uniformly bounded
diameter, independently of $A$ and $G_t$.

Proof. The endpoints have bounded $\mathcal F$-distance by Proposition 5.9 and
therefore the whole interval projects to a bounded set since $\pi(G_t)$ is a
reparametrized quasi-geodesic for $t \in [\lambda_{G_t}(A), \rho_{G_t}(A)]$
(with uniform constants). See Corollary 5.3. □

We will now see a way to estimate where the projection lies. First, a 
general fact that in a different form appears in pQ. 

Lemma 5.11. There is a constant $C_n > 0$ so that the following holds.
Suppose $G, H \in \mathcal X$, $z$ is any class (not necessarily simple), and
$K > 0$. If $\ell(z|G) > C_n K \ell(z|H)$ then there is a simple class $u$
such that $d_{\mathcal F}(H,u)$ is bounded and $u|G$ has a segment of length
$K$ in common with $z|G$ (here we allow wrapping around $u|G$, i.e. taking
segments in high powers of $u|G$). Moreover, $\ell(u|H) < 2$.

Proof. Fix a map $\phi : H \to G$ so that each edge is immersed (or collapsed)
and each vertex has at least two gates (e.g. first change the metric on $H$ as
in Proposition 2.5 so that the tension graph is all of $H$, but in the rest of
the proof we use the original metric).

In the proof we will not keep track of exact constants, but will talk about
"long segments in common with $z|G$". For example, suppose an edge path $A$ in
$H$ is the concatenation $A = BC$ of two sub-edge paths, and we look at the
tightened image $[\phi(A)]$, which is the tightening of the composition
$[\phi(B)][\phi(C)]$. If $[\phi(A)]$ contains a long segment in common with
$z|G$, then so does $[\phi(B)]$ or $[\phi(C)]$ (or both), but "long" in the
conclusion means about a half of "long" in the assumption. The number of times
this argument takes place will be bounded, and at the end the length can be
taken as large as we want by choosing the original length (i.e. $C_n$) large.
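
The halving bookkeeping described above can be written out as a short estimate. This is a sketch only: $k$ (a bound on how many times the splitting argument is used) and $\Lambda_j$ (the length guaranteed after $j$ uses) are hypothetical names introduced here, not notation from the text.

```latex
% Assumed names: k = bound on the number of splitting steps,
% \Lambda_j = length of the common segment guaranteed after j steps.
\[
  \Lambda_0 \sim C_n K, \qquad
  \Lambda_{j+1} \gtrsim \tfrac{1}{2}\,\Lambda_j
  \quad\Longrightarrow\quad
  \Lambda_k \gtrsim 2^{-k}\, C_n K .
\]
% Hence choosing C_n \gtrsim 2^{k} still leaves a common segment
% of length \gtrsim K at the end of the argument.
```

Since $k$ is bounded in terms of the rank only, the loss is absorbed into the constant $C_n$.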

Represent $z|H$ as a composition of $\sim \ell(z|H)$ paths $z_i$ where each
$z_i$ is either an edge, or a combinatorially long (but of length $< 1$) path
contained in the thin subgraph (union of immersed loops of small length).
Consider the tightened images $[\phi(z_i)]$. Thus the loop $z|G$ is the
composition of $\sim \ell(z|H)$ paths. In the process of cancelation
everything must cancel except for a (possibly degenerate) segment
$a_i \subset [\phi(z_i)]$ in each path, and at least one such segment must
have length $\gtrsim C_n K$. So we conclude that for some $z_i$,
$[\phi(z_i)]$ contains a long segment in common with $z|G$. There are now two
cases, depending on whether $z_i$ is contained in the thin part or is an edge.
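
The pigeonhole step in the preceding paragraph can be made explicit. In this sketch $N$ and $c$ are hypothetical names: $N$ is the number of pieces $z_i$ and $c$ is a constant with $N \le c\,\ell(z|H)$, which is our reading of "a composition of $\sim \ell(z|H)$ paths"; $s_i$ denotes the length of the segment of $[\phi(z_i)]$ that survives cancelation.

```latex
% Assumed names: N = number of pieces, c = constant with N <= c * l(z|H),
% s_i = length of the surviving segment in the i-th tightened image.
\begin{align*}
  \ell(z|G) &= \sum_{i=1}^{N} s_i \;\le\; N \max_i s_i
     \;\le\; c\,\ell(z|H)\,\max_i s_i , \\
  \ell(z|G) &> C_n K\,\ell(z|H)
  \quad\Longrightarrow\quad
  \max_i s_i \;>\; \frac{C_n K}{c} .
\end{align*}
```

So at least one surviving segment has length comparable to $C_n K$, which is what "long" means at this stage of the proof.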

Case 1. There is an edge $e$ so that $\phi(e)$ contains a segment of length
$\sim C_n K$ in common with $z|G$. Start extending $e$ to a legal edge path
until an edge is repeated. There are several possibilities.

Type 0. The first repetition is e itself, i.e. we have e..e. Then closing up 
on e gives a legal loop u that crosses each edge at most once, and this loop 
satisfies the conclusion. 

Type 1. The first repetition is either $e^{-1}$ or another edge with reversed
orientation, i.e. $e..e^{-1}$ or $e..a..a^{-1}$. Schematically we picture this
as a monogon. Note that there are two ways to traverse the monogon starting
with $e$ and ending with $e^{-1}$, both legal.

Type 2. The first repetition is an edge $a$ different from $e$ and with the
same orientation, i.e. $e..a..a$. We picture this as a spiral.

We can also extend e in the opposite direction until an edge repeats, and 
so we have three subcases. 

Subcase 1. Type 1-1, i.e. we have Type 1 on both sides. 





Figure 10: Loop of type 1-1. 
If there is an edge $b$ (different from $e$) crossed by both monogons, form a
legal loop $u$ that crosses $e$ exactly once, by switching to the other
monogon after crossing $b$ (when $b$ is crossed by the tails, we may have to
cross $b$ one more time, depending on orientations). This loop satisfies the
conclusion. Otherwise, if no edge is crossed twice by the monogons, the loop
$u$ that traverses both monogons is legal, crosses each edge at most twice,
and crosses some edge once, so it satisfies the conclusion.

Subcase 2. Type 1-2, i.e. we have a monogon on one side and a spiral 
on the other. 





Figure 11: Loop of type 1-2. 

The loop $v$ in the spiral is legal, so if some edge in it maps to a path that
contains a long piece of $z|G$ we can take this loop for $u$. Thus we assume
this does not happen. If some edge $b$ different from $e$ is crossed by both
the spiral and the monogon, we can form a legal loop $u$ that crosses $e$ once
as above, and we are done. Otherwise, let $u$ be the loop that crosses both
the spiral and the monogon, so it has (potentially) one illegal turn. We
claim that $u$ satisfies the conclusion. Indeed, it crosses some edges once
and all edges at most twice. Write $u$ as $E v' E^{-1} v^{-1}$, where $E$ is
the edge-path formed by the two tails and $v'$ is the loop of the monogon.
Schematically, $u$ can be drawn as in Figure 12.

Now consider the image $\phi(u)$. To see how much cancelation occurs, let $V$
be the maximal initial piece of $\phi(E)$ which is also an initial piece of
$\phi(v)^N$ for some $N > 0$. Note that if $V$ contains a long piece of $z|G$
(e.g. if $V = \phi(E)$) then we may take $u = v$. Otherwise, $V$ is a proper
(possibly trivial) initial subpath of $\phi(E)$, and then the path that
cancels in $\phi(u)$ is exactly $V$. In particular, the part of $\phi(E)$
that's left in the tightened $\phi(u)$ contains a long segment in common with
$z|G$.

Subcase 3. Type 2-2, i.e. we have two spirals. 

Figure 12: $u$ in Subcase 2.

Figure 13: Loop of type 2-2.

If either of the two legal loops $v, v'$ in the spirals map to loops with long
segments of $z|G$, we are done. So assume this does not happen. The first
case is that the two spirals do not contain any edges in common, except
for $e$. Then the loop $u$ that crosses both spirals can be written in the
form $u = E v' E^{-1} v^{-1}$, pictured as a bigon (see Figure 14). The
argument is now similar to Subcase 2. Consider the maximal initial piece $V$
[$V'$] of $\phi(E)$ [of $\phi(E^{-1})$] that is also an initial piece of
$\phi(v)^N$ [of $\phi(v')^N$] for some $N > 0$. If either $V$ or $V'$ shares a
long piece with $z|G$ we may take $u = v$ or $u = v'$. Otherwise, in
particular, $V$ and $V'$ do not completely cover $\phi(E)$ and the canceling
paths in the $\phi$-image of the bigon are $V$ and $V'$, so what's left has
a long piece of $z|G$.

If some edge $b$, other than $e$, occurs on both spirals we can construct a
loop that crosses $e$ only once by jumping from one $b$ to the other. If this
loop is legal, we can take it for $u$. Otherwise, it has one illegal turn, and
we may write it as $Ew^{-1}$ where $w$ is either $v$ or a terminal subpath of
$v$ (in case $b$ occurs inside $v$). Let $V$ be the maximal initial piece of
$\phi(E)$ that occurs as an initial piece of $\phi(v)^N$ for some $N > 0$. If
$V$ has a long piece of $z|G$ then $u = v$ works. Otherwise, notice that the
path that cancels in either $Ew^{-1}$ or in $Ew^{-1}v^{-1}$ is an initial
segment in $V$, so what's left has a long piece of $z|G$. Thus $u = Ew^{-1}$
or $u = Ew^{-1}v^{-1}$ will do (by choosing the first possible $b$ leaving $e$
even the second loop crosses each edge at most twice, and it crosses $e$
once).

Figure 14: $u$ in Subcase 3.

Case 2. There is no edge as in Case 1, but there is a path $w$ in the
thin part of $H$ of length $< 1$ such that $[\phi(w)]$ contains a segment of
length $\sim C_n K$ in common with $z|G$. First, if necessary, concatenate $w$
with a combinatorially bounded path in the thin part so that its endpoints
coincide. After this operation $[\phi(w)]$ still has a long piece in common
with $z$ since $\phi$-images of edges do not. If taking $u = w$ does not work,
then the path $[\phi(w)]$ has the form $VUV^{-1}$ with $V$ having a long piece
in common with $z|G$. Choose a combinatorially short loop $a$ in the thin
part, based at the endpoints of $w$. We aim to show that $u = aw$ works.

Consider the maximal common initial segment $A$ of $[\phi(w)]$ and
$[\phi(a)]^{-1}$. Note that $A$ is also a (possibly degenerate) initial
segment of $V$ since we are assuming that $\phi$-images of edges do not have
long segments in common with $z|G$. Write $[\phi(a)] = CA^{-1}$, so
$[\phi(aw)]$ starts with $CJ\cdots$ where $V = AJ$, and it still ends with
$V^{-1} = J^{-1}A^{-1}$. See Figure 15.



Figure 15: $u$ in Subcase 3.
There are now two cases. In the first case, assume that $A$ is no longer
than $C$. In order to have cancelation of long $z|G$-segments in $[\phi(u)]$
we must have $C = AB$ for some path $B$ (nontrivial, since
$[\phi(a)] = ABA^{-1}$), so $[\phi(aw)]$ starts with $ABJ\cdots$. After
canceling $A$ at the beginning and the end of $[\phi(aw)]$ we see that $BJ$
and $J$ have a common initial piece that contains a long segment of $z|G$. As
in the monogon discussion of Subcase 2 above, this forces a long segment of
$z|G$ in $B^N$ for some $N > 0$. But then we may take $u = a$ since
$[\phi(a)] = [ABA^{-1}] = B$.

The other case is that $A$ is longer than $C$. Then we may write $A = CB$,
and we again see that $J$ and $BJ$ have a common initial piece that contains
a long segment of $z|G$. Again take $u = a$. □

Lemma 5.12. Let $G_t$ be a folding path, $H \in \mathcal X$, and $z$ a (not
necessarily simple) class. Assume that $\ell(z|G_\tau) > \ell(z|H)$ and let
$C_n$ be the constant from Lemma 5.11.

(i) If $z|G_\tau$ is legal then $\lambda_{G_t}(u)$ is coarsely to the left of
$G_\tau$ (i.e. $\lambda_{G_t}(u) - \tau <$ const) for some simple loop $u$
with $\ell(u|H) < 2$.

(ii) If $z|G_\tau$ is illegal then $\rho_{G_t}(u)$ is coarsely to the right of
$G_\tau$, but measured in $\mathcal F$ (i.e. either $\rho_{G_t}(u) > \tau$ or
$d_{\mathcal F}(G_\tau, \rho_{G_t}(u))$ is bounded) for some simple loop $u$
with $\ell(u|H) < 2$.

Proof. We first prove (i). First note that we may assume that
$\ell(z|G_\tau) > 3C_n \ell(z|H)$ by moving from $G_\tau$ to $G_t$ for a
bounded $t$. Lemma 5.11 provides a simple loop $u$ with $d_{\mathcal F}(H,u)$
bounded so that $u|G_\tau$ has a segment of length 3 in common with
$z|G_\tau$; in particular it contains a legal segment of length 3 and we are
done.

The proof of (ii) is analogous. First, by moving left from $G_\tau$ a bounded
amount in $\mathcal F$ we may assume $\ell(z|G_\tau) > C_n I \ell(z|H)$ (see
Lemma 4.7). Then Lemma 5.11 provides a simple loop $u$ with
$d_{\mathcal F}(H,u)$ bounded so that $u|G_\tau$ has a segment of length $I$
in common with $z|G_\tau$, which is then illegal. □
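
To see why a bounded move along the path suffices in part (i), here is a sketch of the bookkeeping. It assumes that the folding path is parametrized so that a legal loop grows like $e^{s}$ after time $s$ (an assumption of this sketch, stated here rather than taken from the proof); the symbol $s$ is introduced for illustration.

```latex
% Assumption: legal loops grow like e^{s} along the folding path.
\begin{align*}
  \ell(z|G_{\tau+s}) &= e^{s}\,\ell(z|G_\tau) \;>\; e^{s}\,\ell(z|H)
     && \text{(} z|G_\tau \text{ legal and } \ell(z|G_\tau) > \ell(z|H)\text{)} \\
  s = \log(3C_n) \;&\Longrightarrow\;
  \ell(z|G_{\tau+s}) \;>\; 3\,C_n\,\ell(z|H)
     && \text{(hypothesis of Lemma 5.11 with } K = 3\text{)}
\end{align*}
```

Part (ii) invokes the same lemma with $K = I$, after moving to the left instead of to the right.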

6 Hyperbolicity 

The following proposition provides a blueprint for proving hyperbolicity of
$\mathcal F$. In the case of the curve complex the same blueprint was used by
Bowditch [7].

Proposition 6.1. $\mathcal F$ is hyperbolic if and only if the following holds
for projections of folding paths. There is $C > 0$ so that:

(i) (Fellow Travel) Any two projections $\pi(G_t)$ and $\pi(H_t)$ of folding
paths that start and end "at distance 1" (coarsely interpreted) are in each
other's Hausdorff $C$-neighborhood.

(ii) (Symmetry) If $\pi(G_t)$ goes from $A$ to $B$ and $\pi(H_t)$ from $B$ to
$A$ then the two projections are in each other's Hausdorff $C$-neighborhood.

(iii) (Thin Triangles) Any triangle formed by projections of three folding
paths is $C$-thin.

More precisely, in (i), if $G$ and $H$ are the initial points of the two
paths, the hypothesis means that there exist adjacent free factors $A$ and $B$
such that $A \in \pi(G)$ and $B \in \pi(H)$, and similarly for terminal
points.

Proof. It is clear that (i)-(iii) are necessary for hyperbolicity, since
projections of folding paths are (reparametrized) quasi-geodesics and in
hyperbolic spaces quasi-geodesics stay in bounded neighborhoods of geodesics.
The converse is due to Bowditch [7] (a variant was used earlier by
Masur-Minsky [22]). Here is a sketch. We will show that any loop $\alpha$ in
$\mathcal F$ of length $L$ bounds a disk of area $\sim L \log L$. Subdivide
$\alpha$ into $3 \times 2^N$ segments of size $\sim 1$ and think of it as a
polygon. Subdivide it into triangles in a standard way: a big triangle in the
middle with vertices $2^N$ segments apart, then iteratively bisect remaining
polygonal paths. Represent each diagonal by the image of a folding path; up to
bounded Hausdorff distance the choices are irrelevant. Using Thin Triangles,
each triangle of diameter $D$ can be filled with a disk of area $\sim D$.
Adding the areas of all the triangles gives $\sim N \times 2^N \sim L \log L$.
□
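
The area count at the end of this sketch can be spelled out level by level. The indexing by a level $j$ below is introduced here for illustration; it assumes, as in the sketch, that a triangle of diameter $D$ is filled by a disk of area $\sim D$, and that a triangle whose base spans $2^{N-j+1}$ boundary segments has diameter $\lesssim 2^{N-j+1}$.

```latex
% Level j (1 <= j <= N): 3 * 2^{j-1} triangles, each with base spanning
% 2^{N-j+1} boundary segments; plus one middle triangle of size ~ 2^N.
\begin{align*}
  \text{Area}
   &\lesssim \underbrace{2^{N}}_{\text{middle triangle}}
     + \sum_{j=1}^{N}
       \underbrace{3\cdot 2^{\,j-1}}_{\#\,\text{triangles at level } j}
       \cdot
       \underbrace{2^{\,N-j+1}}_{\text{diameter bound}} \\
   &= 2^{N} + 3N\,2^{N} \;\sim\; N\,2^{N} \;\sim\; L\log L,
   \qquad L \sim 3\cdot 2^{N} .
\end{align*}
```

As a consistency check, the number of triangles is $1 + 3(2^{N}-1) = 3\cdot 2^{N} - 2$, which is the number of sides of the polygon minus two.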

The following Proposition generalizes Algom-Kfir's result [1]. 

Proposition 6.2. Let $H, H' \in \mathcal X$ with $d_{\mathcal X}(H,H') < M$
and let $G_t$ be a folding path such that $d_{\mathcal X}(H,G_t) > M$ for all
$t$. Then the distance between the projections of $H$ and $H'$ to $G_t$ is
uniformly bounded in $\mathcal F$.

Proof. For all candidates $z|H$ denote by $G_1$ the leftmost of all
$\lambda_{G_t}(z)$ and by $G_2$ denote the rightmost of all $\rho_{G_t}(z)$.
Then the interval $[G_1, G_2]$ is bounded in $\mathcal F$ by Proposition 5.4
and Corollary 5.10. Let $z_1|H$ be a candidate that realizes the distance to
$G_1$, so $\ell(z_1|G_1) > e^{M}\ell(z_1|H)$ and
$\ell(z_1|H') < e^{M}\ell(z_1|H)$. Combining these inequalities gives
$\ell(z_1|G_1) > \ell(z_1|H')$, so by Lemma 5.12(ii) $\rho_{G_t}(H')$ is
coarsely to the right of $G_1$, measured in $\mathcal F$. In the same way one
argues that $\lambda_{G_t}(H')$ is coarsely to the left of $G_2$ and the claim
follows. □
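
The two inequalities combined in this proof can be traced back to the hypotheses as follows. This is a sketch; it assumes that $d_{\mathcal X}(G,H)$ is the logarithm of the maximal ratio $\ell(z|H)/\ell(z|G)$ over classes $z$, realized on a candidate, which is the convention suggested by the way lengths and distances are compared elsewhere in the text.

```latex
% Assumption: d_X(G,H) = log sup_z  l(z|H) / l(z|G), realized by a candidate.
\begin{align*}
  \ell(z_1|G_1) &= e^{\,d_{\mathcal X}(H,G_1)}\,\ell(z_1|H) \;>\; e^{M}\,\ell(z_1|H)
     && \text{(} z_1 \text{ realizes } d_{\mathcal X}(H,G_1) > M\text{)} \\
  \ell(z_1|H') &\le e^{\,d_{\mathcal X}(H,H')}\,\ell(z_1|H) \;<\; e^{M}\,\ell(z_1|H)
     && \text{(} d_{\mathcal X}(H,H') < M\text{)} \\
  \ell(z_1|G_1) &> e^{M}\,\ell(z_1|H) \;>\; \ell(z_1|H') .
\end{align*}
```

The last line is exactly the length comparison required to apply Lemma 5.12 with $H'$ in place of $H$.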
Corollary 6.3. A folding line that makes $(\kappa,C)$-definite progress in
$\mathcal F$ is strongly contracting in $\mathcal X$ (with the constants
depending on $\kappa$, $C$). □

This simply means that in the situation of the proposition, projections
of $H$ and $H'$ to $G_t$ are at a uniformly bounded distance in $\mathcal X$
(depending on $\kappa$, $C$), measured from left to right.

Note that a folding line that makes definite progress in $\mathcal F$ is
necessarily in a thick part of $\mathcal X$ (i.e. the injectivity radius of
$G_t$ is bounded below by a positive constant). The converse does not hold
(but recall that it does hold in Teichmüller space, and the above corollary is
the direct analog of Minsky's theorem [23] that Teichmüller geodesics in the
thick part are strongly contracting).

Remark 6.4. One can avoid the use of the technical Lemma 4.6 in the proof
of Proposition [



7 Fellow Travelers and Symmetry 

We will fix constants $C_1$, $C_2$, and $D$ from Proposition 5.4 and
Proposition 6.2 so that:

• If $B < A$ are free factors at $\mathcal F$-distance $> C_1$ from a folding
path $G_t$ then the $\mathcal X$-distance along $G_t$ between
$\lambda_{G_t}(A)$ and $\lambda_{G_t}(B)$ is bounded by $D$,

• The $\mathcal F$-diameter of the projection of a path of length $M$ to any
folding path at distance $> M$ is always $< C_2$,

• $C_1 > C_2$.

Proposition 7.1. Fix $C$ sufficiently large. Suppose $G_t$ and $H_\tau$ are
two folding paths, $d_{\mathcal F}(G_t,H_\tau) > C$ for all $t, \tau$, but the
initial points and the terminal points are at $\mathcal F$-distance $< 10C$.
Then the projections of the two paths to $\mathcal F$ are uniformly bounded in
diameter.

The same holds if the initial point of $G_t$ is $10C$-close to the terminal
point of $H_\tau$ and the terminal point of $G_t$ is $10C$-close to the
initial point of $H_\tau$.

Proof. Subdivide $H_\tau$ into a minimal number of segments whose
$\mathcal F$-diameter is bounded by $C_1$. Say the subdivision points are
$s_0 < s_1 < s_2 < \cdots < s_m$. Let $G_{t_i} = \lambda_{G_t}(H_{s_i})$. When
$C > 2C_1$ we have that the distance, measured from left to right, between
$G_{t_i}$ and $G_{t_{i+1}}$ along $G_t$ is $< C_1 D$ (here $C > 2C_1$ is
needed so that interpolating free factors are also far from $G_t$). The
$\mathcal F$-distance between $G_{t_0}$ and the initial point of $G_t$, and
also between $G_{t_m}$ and the terminal point of $G_t$, is bounded. Further,
$d_{\mathcal X}(G_{t_0},G_{t_m}) < mC_1D$ as long as $G_{t_0}$ is to the left
of $G_{t_m}$ (if it is to the right, the whole path $G_t$ is
$\mathcal F$-bounded). So the projection of $[G_{t_0},G_{t_m}]$ to $H_\tau$ is
bounded by $mC_2$, as long as the $\mathcal X$-distance between $G_t$ and
$H_\tau$ is bounded below by $C_1D$ (if not then the $\mathcal F$-distance
between $G_t$ and $H_\tau$ is bounded, contradiction when $C$ is sufficiently
large, see Corollary 3.4). So,

$$mC_2 + 2(\mathrm{const}) > (m-1)C_1$$

and since $C_1 > C_2$ this implies that $m$ is bounded above. The claim
follows. The proof is analogous in the "anti-parallel" case. □
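
The final step, that the displayed inequality bounds $m$, is a one-line rearrangement. In this sketch $c$ stands for the unnamed constant written "const" in the proof.

```latex
% c denotes the constant written "const" in the displayed inequality.
\begin{align*}
  mC_2 + 2c > (m-1)C_1
  \;&\Longleftrightarrow\;
  m\,(C_1 - C_2) < C_1 + 2c \\
  \;&\Longleftrightarrow\;
  m < \frac{C_1 + 2c}{C_1 - C_2}
  \qquad (\text{using } C_1 > C_2).
\end{align*}
```

This is the only place where the normalization $C_1 > C_2$ is used: without it the coefficient of $m$ could vanish or change sign and no bound would follow.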

Fellow Traveler and Symmetry properties are now immediate. 

Proposition 7.2. Let $G_t$ and $H_\tau$ be folding paths whose initial points
are at $\mathcal F$-distance $< R$ and the same holds for terminal points.
Then $\pi(G_t) \subset \mathcal F$ and $\pi(H_\tau) \subset \mathcal F$ are in
each other's bounded Hausdorff neighborhoods, the bound depending only on $R$.

The same holds when the initial point of $\pi(G_t)$ is $R$-close to the
terminal point of $\pi(H_\tau)$ and the terminal point of $G_t$ is $R$-close
to the initial point of $H_\tau$.

Proof. Let $C > R$ be a sufficiently large constant as in Proposition 7.1.
If $\pi(G_t)$ is not contained in the Hausdorff $C$-neighborhood of
$\pi(H_\tau)$ there is a subpath $[G_{t_1},G_{t_2}]$ such that no point of it
is $C$-close to $\pi(H_\tau)$, but the endpoints $G_{t_1}, G_{t_2}$ are within
$10C$. Then there is a subpath $[H_{\tau_1},H_{\tau_2}]$ of $H_\tau$ whose
endpoints are within $10C$ of the endpoints of $[G_{t_1},G_{t_2}]$ (but notice
that we don't know in advance if the orientations are parallel or
anti-parallel). Now by Proposition 7.1 the set $\pi([G_{t_1},G_{t_2}])$ is in
a bounded Hausdorff neighborhood of $\pi(H_\tau)$. By the same argument
$\pi(H_\tau)$ is contained in a bounded Hausdorff neighborhood of $\pi(G_t)$.
□

Remark 7.3. Note that in the situation of the Proposition, any $G_t$ is within
bounded $\mathcal F$-distance from its projection to $H_\tau$. Indeed, if
$H_{\tau_0}$ is within bounded $\mathcal F$-distance from $G_t$, then from
Corollary 5.5 it follows that the projection of $G_t$ is within bounded
$\mathcal F$-distance of $H_{\tau_0}$.

Proposition 7.4. Let $G_t$ and $H_\tau$ be folding paths whose initial points
are at $\mathcal F$-distance $< R$ and the same holds for terminal points.
There is a uniform bound, depending only on $R$, to the $\mathcal F$-distance
between $\lambda_{G_t}(A)$ and $\lambda_{H_\tau}(A)$ for any free factor $A$.

The same holds in the anti-parallel case.

Figure 16: The "ahead" and the "behind" cases.

Proof. For notational simplicity, we may assume that $A = \langle a \rangle$
is cyclic. First suppose $G_t$ and $H_\tau$ are parallel. Modulo interchanging
the two paths, we can assume that the projection of $A$ to $G_t$ is ahead of
the projection to $H_\tau$, i.e. when $\lambda_{G_t}(A)$ is projected to
$H_\tau$ it is (coarsely, in $\mathcal F$) to the right of
$\lambda_{H_\tau}(A)$. By Lemma 5.12, if
$\ell(a|\lambda_{G_t}(a)) < \ell(a|\rho_{H_\tau}(a))$ then the projection of
$\lambda_{G_t}(a)$ to $H_\tau$ is coarsely to the left of (i.e. behind)
$\rho_{H_\tau}(a)$, and the claim is proved. If
$\ell(a|\lambda_{G_t}(a)) > \ell(a|\rho_{H_\tau}(a))$ then the projection of
$\rho_{H_\tau}(a)$ to $G_t$ is to the right (i.e. ahead) of
$\lambda_{G_t}(a)$, and we are done again.

Now suppose $G_t$ and $H_\tau$ are anti-parallel. There are two subcases to
consider, see Figure 16, where we abbreviate points like $\lambda_{G_t}(A)$ by
$A$. In the first case, the projection of $A$ is ahead, i.e.
$\lambda_{G_t}(A)$ is (coarsely) ahead of the projection to $G_t$ of
$\lambda_{H_\tau}(A)$. This is a symmetric condition with respect to
interchanging $G_t$ and $H_\tau$ (by Remark 7.3). Say
$\ell(a|\lambda_{G_t}(a)) < \ell(a|\lambda_{H_\tau}(a))$. Then the projection
of $\lambda_{G_t}(a)$ to $H_\tau$ is coarsely ahead of $\lambda_{H_\tau}(a)$
by Lemma 5.12 and the claim follows.

The "behind" case is similar, but we consider right projections. Say
$\ell(a|\rho_{G_t}(a)) < \ell(a|\rho_{H_\tau}(a))$. Then the projection of
$\rho_{G_t}(a)$ to $H_\tau$ is coarsely behind $\rho_{H_\tau}(a)$, and the
claim again follows. □

8 Thin Triangles 

Proposition 8.1. Triangles in $\mathcal F$ made of images of folding paths are
uniformly thin. More precisely, if $A, B, C$ are three free factors coarsely
joined by images of folding paths $AB$, $AC$, $BC$ and $\hat C$ is the
projection of $C$ to $AB$, then $AC$ is in a bounded Hausdorff neighborhood of
$A\hat C \cup \hat C C$ and $BC$ is in a bounded Hausdorff neighborhood of
$B\hat C \cup \hat C C$.

We will consider points $U, V, W \in \mathcal X$ and folding paths $H_t$ from
$V$ to $W$ and $G_t$ from $U$ to $W$. Denote by $P$ the projection of $V$ to
$G_t$, more precisely it is the rightmost of the $\rho_{G_t}(z)$ as $z$ ranges
over the candidates in $V$. See Figure 17.




Figure 17: A thin triangle. 

We will prove that $\pi(VP) \cup \pi(PW)$ and $\pi(VW)$ are contained in
uniform Hausdorff neighborhoods of each other (by $VP$ we mean a folding path
from $V$ to $P$ etc). The basic idea is that $VP \cup PW$ behaves like a
folding path and the claim is an instance of the Fellow Traveler Property.

Claim 1. $d_{\mathcal X}(V,W) \ge d_{\mathcal X}(V,P) + d_{\mathcal X}(P,W) - C$
for a universal constant $C$.

To prove the claim, let $v|V$ be a candidate for $d_{\mathcal X}(V,P)$. Thus
$v|P$ has only bounded length illegal subsegments. It follows that after
removing the 1-neighborhood of each illegal turn, a definite percentage of the
length of $v|P$ remains, and hence

$$\ell(v|W) \ge \epsilon\, e^{d_{\mathcal X}(P,W)}\, \ell(v|P)$$

for a fixed $\epsilon > 0$. Thus

$$d_{\mathcal X}(V,W) \ge \log\frac{\ell(v|W)}{\ell(v|V)} \ge d_{\mathcal X}(V,P) + d_{\mathcal X}(P,W) + \log\epsilon$$

By $Q$ denote the point on $VW$ such that
$d_{\mathcal X}(V,Q) = d_{\mathcal X}(V,P)$. From Claim 1 we see that
$d_{\mathcal X}(Q,W) \le d_{\mathcal X}(P,W) \le d_{\mathcal X}(Q,W) + C$.
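
The chain of inequalities in Claim 1 and its consequence for $Q$ can be unpacked as follows. This is a sketch under two assumptions that are ours: the growth estimate is read as $\ell(v|W) \ge \epsilon\, e^{d_{\mathcal X}(P,W)} \ell(v|P)$, and the folding path $VW$ is a geodesic for $d_{\mathcal X}$, so that distances add along it.

```latex
% Assumptions: growth estimate l(v|W) >= eps * e^{d(P,W)} * l(v|P);
% v|V realizes d(V,P); the folding path VW is a d_X-geodesic.
\begin{align*}
  \log\frac{\ell(v|W)}{\ell(v|V)}
   &= \log\frac{\ell(v|W)}{\ell(v|P)} + \log\frac{\ell(v|P)}{\ell(v|V)}
   \;\ge\; \bigl(d_{\mathcal X}(P,W) + \log\epsilon\bigr) + d_{\mathcal X}(V,P), \\
  d_{\mathcal X}(Q,W)
   &= d_{\mathcal X}(V,W) - d_{\mathcal X}(V,P)
   \;\le\; d_{\mathcal X}(P,W)
   && \text{(triangle inequality)} \\
  d_{\mathcal X}(Q,W)
   &= d_{\mathcal X}(V,W) - d_{\mathcal X}(V,P)
   \;\ge\; d_{\mathcal X}(P,W) - C
   && \text{(Claim 1, with } C = -\log\epsilon\text{)}
\end{align*}
```

In particular the universal constant in Claim 1 can be taken to be $C = \log(1/\epsilon)$.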
Claim 2. $d_{\mathcal F}(P,Q)$ is bounded.

This time let $v|V$ be a candidate for $d_{\mathcal X}(V,W)$. Thus

$$\ell(v|Q) = e^{d_{\mathcal X}(V,Q)}\ell(v|V) = e^{d_{\mathcal X}(V,P)}\ell(v|V) \ge \ell(v|P)$$

Let $Q'$ be the point along $QW$ with
$d_{\mathcal X}(Q,Q') = \log(3(6n-6)C_n)$ (if such a point does not exist,
both $P$ and $Q$ are uniformly close to $W$ and we are done). Then
$\ell(v|Q') = 3C_n(6n-6)\ell(v|Q) \ge 3C_n(6n-6)\ell(v|P)$, so Lemma 5.11
implies that there is a simple loop $p|P$ of length $< 2$ such that $p|Q'$ has
a legal segment of length $3(6n-6)$. Now $p|Q'$ cannot contain many disjoint
legal segments of length 3, for otherwise the length of $p$ at $W$ would grow
to a much longer length along $Q'W$ than along $PW$. Now the Claim follows
from Lemma 5.2.
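
The choice of the distance $\log(3(6n-6)C_n)$ in Claim 2 is tuned to the hypothesis of Lemma 5.11. The following sketch makes the matching explicit; it assumes that $v$, being a candidate for $d_{\mathcal X}(V,W)$, stays legal along $VW$ and so grows at the maximal rate, and it ignores the distinction between strict and non-strict inequalities.

```latex
% Assumption: v is legal along VW, so l(v|Q') = e^{d(Q,Q')} * l(v|Q).
\begin{align*}
  \ell(v|Q') &= e^{\,d_{\mathcal X}(Q,Q')}\,\ell(v|Q)
             = 3(6n-6)\,C_n\,\ell(v|Q) \\
            &\ge 3(6n-6)\,C_n\,\ell(v|P)
             = C_n\,K\,\ell(v|P),
  \qquad K = 3(6n-6).
\end{align*}
```

With $G = Q'$, $H = P$ and $z = v$, this is the length hypothesis of Lemma 5.11 for $K = 3(6n-6)$, which is why the resulting loop $p$ shares a segment of that length with the legal loop $v|Q'$.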



Proof of Proposition 8.1. By Proposition 7.2 we are free to replace a folding
path by another whose endpoints project close to the endpoints of the
original, and we are allowed to reverse orientations. By Proposition 7.4 these
replacements affect the projections by a bounded amount, so that the
projection $\hat C$ of $C$ to $AB$ is coarsely well-defined, independently of
the choice of a folding path whose projection coarsely connects $A$ and $B$
with either orientation. In particular, we may assume that $U, V, W$ project
near $A, C, B$ respectively and we may consider folding paths $UW$ and $VW$
that end at the same graph $W$. The above discussion then shows that $BC$ is
contained in a bounded Hausdorff neighborhood of $B\hat C \cup \hat C C$.
Making analogous choices of the folding paths, the claim about $AC$ follows
similarly. □

The following fact shows that for the purposes of this paper the collection 
of folding paths can be replaced by the larger collection of geodesic paths. 

Proposition 8.2. Let $V, P, W \in \mathcal X$ so that
$d_{\mathcal X}(V,P) + d_{\mathcal X}(P,W) = d_{\mathcal X}(V,W)$ and let
$V' \in \mathcal X$ be such that
$d_{\mathcal X}(V,V') + d_{\mathcal X}(V',W) = d_{\mathcal X}(V,W)$,
$d_{\mathcal F}(V,V')$ is bounded, and there is a folding path $G_t$ from $V'$
to $W$ (see Proposition 2.5). Then the $\mathcal F$-distance between $P$ and
some $G_t$ is uniformly bounded. Moreover, the correspondence $P \mapsto G_t$
can be taken to be monotonic with respect to $d_{\mathcal X}(V,P)$.
Consequently, $\pi(H_\tau)$ is a reparametrized quasi-geodesic for any
geodesic $H_\tau$, with uniform constants.

The proof is a variant of the discussion above.

Proof. Let $P'$ be the point on $G_t$ with
$d_{\mathcal X}(V,P') = d_{\mathcal X}(V,P)$ (if such a point does not exist
we assign $V'$ to $P$). We need to argue that $d_{\mathcal F}(P,P')$ is
bounded. Let $v|V$ be a candidate realizing $d_{\mathcal X}(V,W)$. Thus
$v|G_t$ is legal for all $t$ and $\ell(v|P) = \ell(v|P')$. Let $Q'$ be a point
on $G_t$ with $d_{\mathcal X}(P',Q') = \log(3(6n-6)C_n)$ (if such a point does
not exist, both $P$ and $P'$ are close to $W$ and we are done). Then
$\ell(v|Q') = 3C_n(6n-6)\ell(v|P)$, so Lemma 5.11 implies that there is a
simple loop $p|P$ with $\ell(p|P) < 2$ and with $p|Q'$ containing a legal
segment of length $3(6n-6)$. Now $p|Q'$ cannot contain many disjoint legal
segments of length 3, for otherwise the length of $p$ at $W$ would grow to a
much longer length along $Q'W$ than along $PW$. Lemma 5.2 implies that
$d_{\mathcal F}(P,Q')$, and therefore $d_{\mathcal F}(P,P')$, is bounded. □

To summarize: 

Theorem 8.3. $\mathcal F$ is $\delta$-hyperbolic. Images of geodesic paths in
$\mathcal X$ are in uniform Hausdorff neighborhoods of geodesics with the same
endpoints.

We remark that $\Phi \in Out(\mathbb F)$ has positive translation length in
$\mathcal F$ if and only if it is fully irreducible (this follows e.g. from
[5]). Moreover, the action of $Out(\mathbb F)$ on $\mathcal F$ satisfies the
Weak Proper Discontinuity (see [6]).